[jira] Commented: (HIVE-809) Create a copier to copy data from scribe hdfs cluster to main DW cluster
[ https://issues.apache.org/jira/browse/HIVE-809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793704#action_12793704 ] Suresh Antony commented on HIVE-809: I am on vacation from 12/11/09 to 1/5/10. > Create a copier to copy data from scribe hdfs cluster to main DW cluster > > > Key: HIVE-809 > URL: https://issues.apache.org/jira/browse/HIVE-809 > Project: Hadoop Hive > Issue Type: New Feature >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Attachments: patch_809_1.txt > > > Currently we have scribe hdfs, which write scribe data directly to HDFS > cluster. But in most cases this cluster will not be used for accessing the > data. > This data needs to copied to cluster from which you can access this scribe > using hive or some other tool. > This copier should be able to copy large amounts of data on a new realtime > bases. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-809) Create a copier to copy data from scribe hdfs cluster to main DW cluster
[ https://issues.apache.org/jira/browse/HIVE-809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-809: --- Status: Patch Available (was: Open) Submitted patch for scribehdfs to main hdfs copier. > Create a copier to copy data from scribe hdfs cluster to main DW cluster > > > Key: HIVE-809 > URL: https://issues.apache.org/jira/browse/HIVE-809 > Project: Hadoop Hive > Issue Type: New Feature >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.4.0 > > Attachments: patch_809_1.txt > > > Currently we have scribe hdfs, which write scribe data directly to HDFS > cluster. But in most cases this cluster will not be used for accessing the > data. > This data needs to copied to cluster from which you can access this scribe > using hive or some other tool. > This copier should be able to copy large amounts of data on a new realtime > bases. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-809) Create a copier to copy data from scribe hdfs cluster to main DW cluster
[ https://issues.apache.org/jira/browse/HIVE-809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-809: --- Attachment: patch_809_1.txt patch for scribe data copier. > Create a copier to copy data from scribe hdfs cluster to main DW cluster > > > Key: HIVE-809 > URL: https://issues.apache.org/jira/browse/HIVE-809 > Project: Hadoop Hive > Issue Type: New Feature >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.4.0 > > Attachments: patch_809_1.txt > > > Currently we have scribe hdfs, which write scribe data directly to HDFS > cluster. But in most cases this cluster will not be used for accessing the > data. > This data needs to copied to cluster from which you can access this scribe > using hive or some other tool. > This copier should be able to copy large amounts of data on a new realtime > bases. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-809) Create a copier to copy data from scribe hdfs cluster to main DW cluster
Create a copier to copy data from scribe hdfs cluster to main DW cluster Key: HIVE-809 URL: https://issues.apache.org/jira/browse/HIVE-809 Project: Hadoop Hive Issue Type: New Feature Reporter: Suresh Antony Assignee: Suresh Antony Priority: Minor Fix For: 0.4.0 Currently we have scribe HDFS, which writes scribe data directly to an HDFS cluster. But in most cases this cluster will not be used for accessing the data. This data needs to be copied to a cluster from which you can access the scribe data using Hive or some other tool. This copier should be able to copy large amounts of data on a near-realtime basis. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
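As a rough illustration of what such a copier's core loop could look like, here is a minimal sketch using the standard Hadoop FileSystem/FileUtil API; the cluster URIs and directory layout are placeholders, and none of this is taken from patch_809_1.txt.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FileUtil;
    import org.apache.hadoop.fs.Path;

    public class ScribeCopySketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder namenode URIs for the scribe HDFS cluster and the main DW cluster.
        FileSystem srcFs = FileSystem.get(new java.net.URI("hdfs://scribe-nn:8020"), conf);
        FileSystem dstFs = FileSystem.get(new java.net.URI("hdfs://dw-nn:8020"), conf);
        Path srcDir = new Path("/scribe/category/2009-12-11");        // hypothetical layout
        Path dstDir = new Path("/user/hive/warehouse/scribe_raw");
        for (FileStatus stat : srcFs.listStatus(srcDir)) {
          Path dst = new Path(dstDir, stat.getPath().getName());
          if (!dstFs.exists(dst)) {
            // Copy one file between clusters; 'false' keeps the source file in place.
            FileUtil.copy(srcFs, stat.getPath(), dstFs, dst, false, conf);
          }
        }
      }
    }

A real copier would additionally need to track which files scribe has finished writing, retry on failure, and run continuously to approach near-realtime copying.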
[jira] Updated: (HIVE-563) UDF for parsing the URL
[ https://issues.apache.org/jira/browse/HIVE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-563: --- Attachment: patch_563.txt.2 Removed String.split() -- Added a second evaluate method where the user can specify a key within 'QUERY' as a separate argument, e.g.: parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k2'), parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k1') > UDF for parsing the URL > --- > > Key: HIVE-563 > URL: https://issues.apache.org/jira/browse/HIVE-563 > Project: Hadoop Hive > Issue Type: New Feature > Components: Server Infrastructure >Reporter: Suresh Antony >Assignee: Suresh Antony > Attachments: patch_563.txt, patch_563.txt.1, patch_563.txt.2 > > > Needs a udf to extract the parts of url from url string. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
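For readers of the archive, a hedged sketch of what the three-argument form presumably does - pulling one key's value out of the URL's query string. The class and helper names here are illustrative, not the actual UDF code from the patch.

    import java.net.URL;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class QueryKeySketch {
      // Return the value of one key from a URL's query string, or null if absent.
      static String queryKey(String urlString, String key) throws Exception {
        String query = new URL(urlString).getQuery();          // e.g. "k1=v1&k2=v2"
        if (query == null) {
          return null;
        }
        Pattern p = Pattern.compile("(?:^|&)" + Pattern.quote(key) + "=([^&]*)");
        Matcher m = p.matcher(query);
        return m.find() ? m.group(1) : null;
      }

      public static void main(String[] args) throws Exception {
        // Mirrors the example in the comment above.
        System.out.println(queryKey("http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1", "k2")); // v2
      }
    }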
[jira] Updated: (HIVE-563) UDF for parsing the URL
[ https://issues.apache.org/jira/browse/HIVE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-563: --- Attachment: patch_563.txt.1 * UDF to extract specific parts from URL * parse_url('http://facebook.com/path/p1.php?query=1', 'HOST') will return 'facebook.com' * parse_url('http://facebook.com/path/p1.php?query=1', 'PATH') will return '/path/p1.php' * parse_url('http://facebook.com/path/p1.php?query=1', 'QUERY') will return 'query=1' * parse_url('http://facebook.com/path/p1.php?query=1#Ref', 'REF') will return 'Ref' * parse_url('http://facebook.com/path/p1.php?query=1#Ref', 'PROTOCOL') will return 'http' * Possible values are HOST,PATH,QUERY,REF,PROTOCOL,AUTHORITY,FILE,USERINFO * Also you can get a value of particular key in QUERY, using syntax QUERY: eg: QUERY:k1. > UDF for parsing the URL > --- > > Key: HIVE-563 > URL: https://issues.apache.org/jira/browse/HIVE-563 > Project: Hadoop Hive > Issue Type: New Feature > Components: Server Infrastructure >Reporter: Suresh Antony >Assignee: Suresh Antony > Attachments: patch_563.txt, patch_563.txt.1 > > > Needs a udf to extract the parts of url from url string. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-563) UDF for parsing the URL
[ https://issues.apache.org/jira/browse/HIVE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-563: --- Attachment: patch_563.txt.1 UDF to extract specific parts from URL parse_url('http://facebook.com/path/p1.php?query=1', 'HOST') will return 'facebook.com' parse_url('http://facebook.com/path/p1.php?query=1', 'PATH') will return '/path/p1.php' parse_url('http://facebook.com/path/p1.php?query=1', 'QUERY') will return 'query=1' parse_url('http://facebook.com/path/p1.php?query=1#Ref', 'REF') will return 'Ref' parse_url('http://facebook.com/path/p1.php?query=1#Ref', 'PROTOCOL') will return 'http' Possible values are HOST,PATH,QUERY,REF,PROTOCOL,AUTHORITY,FILE,USERINFO Also you can get a value of particular key in QUERY, using syntax QUERY: eg: QUERY:k1. > UDF for parsing the URL > --- > > Key: HIVE-563 > URL: https://issues.apache.org/jira/browse/HIVE-563 > Project: Hadoop Hive > Issue Type: New Feature > Components: Server Infrastructure >Reporter: Suresh Antony >Assignee: Suresh Antony > Attachments: patch_563.txt > > > Needs a udf to extract the parts of url from url string. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-563) UDF for parsing the URL
[ https://issues.apache.org/jira/browse/HIVE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-563: --- Attachment: (was: patch_563.txt.1) > UDF for parsing the URL > --- > > Key: HIVE-563 > URL: https://issues.apache.org/jira/browse/HIVE-563 > Project: Hadoop Hive > Issue Type: New Feature > Components: Server Infrastructure >Reporter: Suresh Antony >Assignee: Suresh Antony > Attachments: patch_563.txt > > > Needs a udf to extract the parts of url from url string. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-563) UDF for parsing the URL
[ https://issues.apache.org/jira/browse/HIVE-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-563: --- Attachment: patch_563.txt parse_url -- UDF. Format: parse_url(url, URL_PART_NAME). Possible URL parts are: HOST,PATH,QUERY,REF,PROTOCOL,AUTHORITY,FILE,USERINFO Example: parse_url('http://facebook.com/path/p1.php?query=1', 'HOST') will return 'facebook.com' parse_url('http://facebook.com/path/p1.php?query=1', 'PATH') will return '/path/p1.php' Definitions of the parts can be obtained from: http://www.j2ee.me/j2se/1.4.2/docs/api/java/net/URL.html > UDF for parsing the URL > --- > > Key: HIVE-563 > URL: https://issues.apache.org/jira/browse/HIVE-563 > Project: Hadoop Hive > Issue Type: New Feature > Components: Server Infrastructure >Reporter: Suresh Antony >Assignee: Suresh Antony > Attachments: patch_563.txt > > > Needs a udf to extract the parts of url from url string. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
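The part names listed above line up closely with the accessors of java.net.URL referenced in the link; the following is an assumed mapping for illustration, not the patch itself.

    import java.net.URL;

    public class UrlPartsSketch {
      // Illustrative mapping from the part names above to java.net.URL getters.
      static String part(String urlString, String partName) throws Exception {
        URL u = new URL(urlString);
        switch (partName) {
          case "HOST":      return u.getHost();
          case "PATH":      return u.getPath();
          case "QUERY":     return u.getQuery();
          case "REF":       return u.getRef();
          case "PROTOCOL":  return u.getProtocol();
          case "AUTHORITY": return u.getAuthority();
          case "FILE":      return u.getFile();
          case "USERINFO":  return u.getUserInfo();
          default:          return null;
        }
      }

      public static void main(String[] args) throws Exception {
        System.out.println(part("http://facebook.com/path/p1.php?query=1", "HOST")); // facebook.com
        System.out.println(part("http://facebook.com/path/p1.php?query=1", "PATH")); // /path/p1.php
      }
    }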
[jira] Created: (HIVE-563) UDF for parsing the URL
UDF for parsing the URL --- Key: HIVE-563 URL: https://issues.apache.org/jira/browse/HIVE-563 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Suresh Antony Assignee: Suresh Antony Need a UDF to extract the parts of a URL from a URL string. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-376) In strict mode do not allow join without "ON" condition
[ https://issues.apache.org/jira/browse/HIVE-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-376: --- Component/s: Configuration Priority: Minor (was: Major) Affects Version/s: 0.4.0 Fix Version/s: 0.4.0 Issue Type: New Feature (was: Bug) > In strict mode do not allow join without "ON" condition > > > Key: HIVE-376 > URL: https://issues.apache.org/jira/browse/HIVE-376 > Project: Hadoop Hive > Issue Type: New Feature > Components: Configuration >Affects Versions: 0.4.0 >Reporter: Suresh Antony >Priority: Minor > Fix For: 0.4.0 > > > In strict mode do not allow join without "ON" condition. This will result in > cartition product and explosion of data. Very few people want to run with > join condition. Usually it is a mistake. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-376) In strict mode do not allow join without "ON" condition
In strict mode do not allow join without "ON" condition Key: HIVE-376 URL: https://issues.apache.org/jira/browse/HIVE-376 Project: Hadoop Hive Issue Type: Bug Reporter: Suresh Antony In strict mode, do not allow a join without an "ON" condition. This will result in a Cartesian product and an explosion of data (for example, joining two tables of a million rows each without an ON clause produces 10^12 output rows). Very few people actually want to run a join without a join condition; usually it is a mistake. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-349) HiveHistory: TestCLiDriver fails if there are test cases with no tasks
[ https://issues.apache.org/jira/browse/HIVE-349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-349: --- Attachment: patch_349_1.txt > HiveHistory: TestCLiDriver fails if there are test cases with no tasks > --- > > Key: HIVE-349 > URL: https://issues.apache.org/jira/browse/HIVE-349 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.3.0 >Reporter: Suresh Antony >Assignee: Suresh Antony > Fix For: 0.3.0 > > Attachments: patch_349_1.txt > > > TestCLIDriver Fails for some test cases. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-349) HiveHistory: TestCLiDriver fails if there are test cases with no tasks
[ https://issues.apache.org/jira/browse/HIVE-349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-349: --- Summary: HiveHistory: TestCLiDriver fails if there are test cases with no tasks (was: HiveHistory: TestCLiDriver fails if there are test cases with tasks) > HiveHistory: TestCLiDriver fails if there are test cases with no tasks > --- > > Key: HIVE-349 > URL: https://issues.apache.org/jira/browse/HIVE-349 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.3.0 >Reporter: Suresh Antony >Assignee: Suresh Antony > Fix For: 0.3.0 > > > TestCLIDriver Fails for some test cases. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-349) HiveHistory: TestCLiDriver fails if there are test cases with tasks
[ https://issues.apache.org/jira/browse/HIVE-349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony reassigned HIVE-349: -- Assignee: Suresh Antony > HiveHistory: TestCLiDriver fails if there are test cases with tasks > > > Key: HIVE-349 > URL: https://issues.apache.org/jira/browse/HIVE-349 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.3.0 >Reporter: Suresh Antony >Assignee: Suresh Antony > Fix For: 0.3.0 > > > TestCLIDriver Fails for some test cases. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-349) HiveHistory: TestCLiDriver fails if there are test cases with tasks
HiveHistory: TestCLiDriver fails if there are test cases with tasks Key: HIVE-349 URL: https://issues.apache.org/jira/browse/HIVE-349 Project: Hadoop Hive Issue Type: Bug Affects Versions: 0.3.0 Reporter: Suresh Antony Fix For: 0.3.0 TestCLIDriver Fails for some test cases. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-327) row count getting printed wrongly
[ https://issues.apache.org/jira/browse/HIVE-327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-327: --- Attachment: patch_327_1.txt Fixed the problem: the row-count hash map is now associated with the query id. > row count getting printed wrongly > - > > Key: HIVE-327 > URL: https://issues.apache.org/jira/browse/HIVE-327 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.2.0 >Reporter: Suresh Antony >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_327_1.txt > > > When multiple queries are executed in same session, row count of the first > query is getting printed for subsequent queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
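A tiny sketch of the shape of fix being described - keying the per-table row counts by query id so that one query's counts cannot leak into the next query in the same session. This is illustrative structure only, not the actual HiveHistory change.

    import java.util.HashMap;
    import java.util.Map;

    public class RowCountsSketch {
      // queryId -> (tableName -> rows inserted); one inner map per query.
      private final Map<String, Map<String, Long>> rowCounts = new HashMap<>();

      void addRows(String queryId, String table, long rows) {
        rowCounts.computeIfAbsent(queryId, k -> new HashMap<>())
                 .merge(table, rows, Long::sum);
      }

      // Called when a query finishes: report and drop its counts so the next
      // query in the same session starts from a clean slate.
      Map<String, Long> finishQuery(String queryId) {
        return rowCounts.remove(queryId);
      }

      public static void main(String[] args) {
        RowCountsSketch counts = new RowCountsSketch();
        counts.addRows("query_1", "tab1", 100);
        counts.addRows("query_2", "tab1", 7);
        System.out.println(counts.finishQuery("query_1"));   // {tab1=100}, unaffected by query_2
      }
    }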
[jira] Created: (HIVE-327) row count getting printed wrongly
row count getting printed wrongly - Key: HIVE-327 URL: https://issues.apache.org/jira/browse/HIVE-327 Project: Hadoop Hive Issue Type: Bug Affects Versions: 0.2.0 Reporter: Suresh Antony Assignee: Suresh Antony Fix For: 0.2.0 When multiple queries are executed in same session, row count of the first query is getting printed for subsequent queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.
[ https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-79: -- Attachment: patch_79_5.txt The test failed yesterday because a new test was added. The patch changes the plan for the FileSink operator, so a newly added test case will fail the unit test. > Print number of rows inserted to table(s) when the query is finished. > -- > > Key: HIVE-79 > URL: https://issues.apache.org/jira/browse/HIVE-79 > Project: Hadoop Hive > Issue Type: New Feature > Components: Logging >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.2.0 > > Attachments: patch_79_1.txt, patch_79_2.txt, patch_79_3.txt, > patch_79_4.txt, patch_79_5.txt > > > It is good to print the number of rows inserted into each table at end of > query. > insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10; > This query can print something like: > tab1 rows=100 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.
[ https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-79: -- Attachment: patch_79_4.txt Resolved the conflicts > Print number of rows inserted to table(s) when the query is finished. > -- > > Key: HIVE-79 > URL: https://issues.apache.org/jira/browse/HIVE-79 > Project: Hadoop Hive > Issue Type: New Feature > Components: Logging >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.2.0 > > Attachments: patch_79_1.txt, patch_79_2.txt, patch_79_3.txt, > patch_79_4.txt > > > It is good to print the number of rows inserted into each table at end of > query. > insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10; > This query can print something like: > tab1 rows=100 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.
[ https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674327#action_12674327 ] Suresh Antony commented on HIVE-79: --- If so, can I commit this patch? > Print number of rows inserted to table(s) when the query is finished. > -- > > Key: HIVE-79 > URL: https://issues.apache.org/jira/browse/HIVE-79 > Project: Hadoop Hive > Issue Type: New Feature > Components: Logging >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.2.0 > > Attachments: patch_79_1.txt, patch_79_2.txt, patch_79_3.txt > > > It is good to print the number of rows inserted into each table at end of > query. > insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10; > This query can print something like: > tab1 rows=100 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.
[ https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673919#action_12673919 ] Suresh Antony commented on HIVE-79: --- Trying to add test case ... We run our test cases in local mode. Looks like there are no counters getting created in local mode. Need to investigate more. > Print number of rows inserted to table(s) when the query is finished. > -- > > Key: HIVE-79 > URL: https://issues.apache.org/jira/browse/HIVE-79 > Project: Hadoop Hive > Issue Type: New Feature > Components: Logging >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.2.0 > > Attachments: patch_79_1.txt, patch_79_2.txt, patch_79_3.txt > > > It is good to print the number of rows inserted into each table at end of > query. > insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10; > This query can print something like: > tab1 rows=100 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.
[ https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-79: -- Attachment: patch_79_3.txt 1. Added printing of the row count to the output. 2. Also added the row count to the query history. 3. Changed the test outputs because fileSinkDesc changed; added a table id to the FileSink descriptor. > Print number of rows inserted to table(s) when the query is finished. > -- > > Key: HIVE-79 > URL: https://issues.apache.org/jira/browse/HIVE-79 > Project: Hadoop Hive > Issue Type: New Feature > Components: Logging >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.2.0 > > Attachments: patch_79_1.txt, patch_79_2.txt, patch_79_3.txt > > > It is good to print the number of rows inserted into each table at end of > query. > insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10; > This query can print something like: > tab1 rows=100 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.
[ https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-79: -- Attachment: patch_79_2.txt Implemented the feedback from Namit. Resetting the id and map in the reset() call. > Print number of rows inserted to table(s) when the query is finished. > -- > > Key: HIVE-79 > URL: https://issues.apache.org/jira/browse/HIVE-79 > Project: Hadoop Hive > Issue Type: New Feature > Components: Logging >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.2.0 > > Attachments: patch_79_1.txt, patch_79_2.txt > > > It is good to print the number of rows inserted into each table at end of > query. > insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10; > This query can print something like: > tab1 rows=100 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.
[ https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-79: -- Attachment: patch_79_1.txt This patch logs the inserted row count to the Hive query log. The logged format will be: TaskEnd TASK_ROWS_INSERTED="tmp_suresh_12:181687,tmp_suresh_13:181687" Made changes to the semantic analyzer to keep track of the id-to-table-name map. HiveHistory converts the id back to the table name and writes it to the structured query log. > Print number of rows inserted to table(s) when the query is finished. > -- > > Key: HIVE-79 > URL: https://issues.apache.org/jira/browse/HIVE-79 > Project: Hadoop Hive > Issue Type: New Feature > Components: Logging >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.2.0 > > Attachments: patch_79_1.txt > > > It is good to print the number of rows inserted into each table at end of > query. > insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10; > This query can print something like: > tab1 rows=100 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
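A small sketch of producing a TaskEnd line in the format quoted above, assuming the comma-separated table:count convention shown; the class and method names are illustrative.

    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.StringJoiner;

    public class TaskEndLineSketch {
      // Build a line like: TaskEnd TASK_ROWS_INSERTED="tmp_suresh_12:181687,tmp_suresh_13:181687"
      static String taskEndLine(Map<String, Long> rowsPerTable) {
        StringJoiner counts = new StringJoiner(",");
        for (Map.Entry<String, Long> e : rowsPerTable.entrySet()) {
          counts.add(e.getKey() + ":" + e.getValue());
        }
        return "TaskEnd TASK_ROWS_INSERTED=\"" + counts + "\"";
      }

      public static void main(String[] args) {
        Map<String, Long> rows = new LinkedHashMap<>();
        rows.put("tmp_suresh_12", 181687L);
        rows.put("tmp_suresh_13", 181687L);
        System.out.println(taskEndLine(rows));
      }
    }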
[jira] Assigned: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.
[ https://issues.apache.org/jira/browse/HIVE-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony reassigned HIVE-79: - Assignee: Suresh Antony > Print number of rows inserted to table(s) when the query is finished. > -- > > Key: HIVE-79 > URL: https://issues.apache.org/jira/browse/HIVE-79 > Project: Hadoop Hive > Issue Type: New Feature > Components: Logging >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.2.0 > > > It is good to print the number of rows inserted into each table at end of > query. > insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10; > This query can print something like: > tab1 rows=100 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-257) Put the structured Hive query log location in hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-257: --- Attachment: patch_257_2.txt The last patch contained all the changes... > Put the structured Hive query log location in hive-site.xml > -- > > Key: HIVE-257 > URL: https://issues.apache.org/jira/browse/HIVE-257 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.2.0 >Reporter: Suresh Antony >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_257.txt, patch_257_2.txt > > > Put the structured Hive query log location in hive-site.xml. Also change the name of > the query log to add a random integer to the file name, so that multiple sessions do > not overwrite the same file. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-257) Put the structured Hive query log location in hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-257: --- Attachment: patch_257.txt > Put the structured Hive query log location in hive-site.xml > -- > > Key: HIVE-257 > URL: https://issues.apache.org/jira/browse/HIVE-257 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.2.0 >Reporter: Suresh Antony >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_257.txt > > > Put the structured Hive query log location in hive-site.xml. Also change the name of > the query log to add a random integer to the file name, so that multiple sessions do > not overwrite the same file. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-257) Put the structured Hive query log location in hive-site.xml
Put the structured Hive query log location in hive-site.xml -- Key: HIVE-257 URL: https://issues.apache.org/jira/browse/HIVE-257 Project: Hadoop Hive Issue Type: Bug Affects Versions: 0.2.0 Reporter: Suresh Antony Assignee: Suresh Antony Fix For: 0.2.0 Put the structured Hive query log location in hive-site.xml. Also change the name of the query log to add a random integer to the file name, so that multiple sessions do not overwrite the same file. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
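A sketch of the per-session log file naming described here - a configured directory plus a random suffix so that concurrent sessions do not collide. The exact file-name pattern below is an assumption; only the idea of appending a random integer comes from the issue.

    import java.io.File;
    import java.util.Random;

    public class QueryLogPathSketch {
      // Build something like <dir>/hive_job_log_<user>_<random>.txt, where <dir>
      // comes from hive-site.xml (here passed in as a plain string).
      static File logFile(String configuredDir, String user) {
        int suffix = new Random().nextInt(Integer.MAX_VALUE);
        return new File(configuredDir, "hive_job_log_" + user + "_" + suffix + ".txt");
      }

      public static void main(String[] args) {
        System.out.println(logFile("/tmp/hive-querylog", System.getProperty("user.name")));
      }
    }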
[jira] Updated: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-176: --- Attachment: patch_176_2.txt added hive.querylog.location > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Affects Versions: 0.2.0 >Reporter: Joydeep Sen Sarma >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_176.txt, patch_176.txt, patch_176.txt, > patch_176.txt, patch_176.txt, patch_176.txt, patch_176.txt, patch_176_2.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667944#action_12667944 ] Suresh Antony commented on HIVE-176: Create the Hive job history directory if it does not exist. > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Affects Versions: 0.2.0 >Reporter: Joydeep Sen Sarma >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_176.txt, patch_176.txt, patch_176.txt, > patch_176.txt, patch_176.txt, patch_176.txt, patch_176.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
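The directory-creation step mentioned in this comment amounts to something like the following generic sketch (not the patch itself).

    import java.io.File;

    public class EnsureHistoryDirSketch {
      // Create the configured query-log directory (and any missing parents) if absent.
      static void ensureDir(String location) {
        File dir = new File(location);
        if (!dir.exists() && !dir.mkdirs()) {
          throw new RuntimeException("could not create history directory " + dir);
        }
      }

      public static void main(String[] args) {
        ensureDir(System.getProperty("java.io.tmpdir") + "/hive_history_sketch");
      }
    }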
[jira] Updated: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-176: --- Attachment: patch_176.txt > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Affects Versions: 0.2.0 >Reporter: Joydeep Sen Sarma >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_176.txt, patch_176.txt, patch_176.txt, > patch_176.txt, patch_176.txt, patch_176.txt, patch_176.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-176: --- Attachment: patch_176.txt Made the following changes: added a "TIME" key to every line, which will have the value of System.currentTimeMillis(); added a QueryId instead of using the query string as the query id; added a new conf variable "hive.query.id". > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Affects Versions: 0.2.0 >Reporter: Joydeep Sen Sarma >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_176.txt, patch_176.txt, patch_176.txt, > patch_176.txt, patch_176.txt, patch_176.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665148#action_12665148 ] Suresh Antony commented on HIVE-176: Changed hive.joblog.location to hive.querylog.location. Also changed e.printStackTrace(). Changed CliDriver: moved SessionState.start() after the conf variable initialization; otherwise, a conf setting changing hive.querylog.location had no effect, since HiveHistory was getting initialized even before the HiveConf was parsed. Thanks for all the feedback. > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Affects Versions: 0.2.0 >Reporter: Joydeep Sen Sarma >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_176.txt, patch_176.txt, patch_176.txt, > patch_176.txt, patch_176.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-176: --- Attachment: patch_176.txt > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Affects Versions: 0.2.0 >Reporter: Joydeep Sen Sarma >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_176.txt, patch_176.txt, patch_176.txt, > patch_176.txt, patch_176.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664204#action_12664204 ] Suresh Antony commented on HIVE-176:
* inferNumReducers(): instead of two calls to the hivehistory - can just make one call at the end of the function when the numReducers has been set for sure. We could also set NUM_REDUCERS to 0 when no reducer is specified (more informative imho). --- Made it a single call after this function call.
* I still don't see why HAS_REDUCE_TASKS and NUM_REDUCE_TASKS are meaningful counters. What is the use case? --- Removed both of these variables.
* In TestHiveHistory - please use the setup() method or constructor to do initialization. Also a negative test case would be good (to check if a negative job status is being captured, for example). --- Moved this code to setUp().
* HiveHistoryViewer - indentation is badly off. I think we are following a general convention of '} else {' as well (and curly braces on the same line as the function/class declaration, viz. 'void init() {'). --- Re-formatted using the Eclipse formatter.
* JOB_STATUS and TASK_STATUS are both unused.
* I couldn't understand this code block in parseHiveHistory: + if (!line.trim().endsWith("\"")){ + continue; + } Can you explain? --- The format is key="value"... so a value line that does not end with " means the value has a newline.
* parseLine: confused that we have a regex group for the key - but are not using it .. seems weird - if you had groups for both key and value you wouldn't need to split. Alternately you can rely on just the split. --- Cut and pasted this code from the JobHistory parser.
* getHiveHistory - I don't think it's a good idea to initialize the hivehistory object on demand: a) you always need it b) it prints stuff to the console (log file location). If you want a deterministic location for this log - we should just initialize hivehistory at session initialization so that the log file location always comes at the beginning of the session (and not at some random point when the code actually requires it). --- Moved the HiveHistory initialization to the constructor of SessionState.
* It would be good to have an example of the hive history file/format checked in somewhere with a pointer to it from the documentation (either in README or wiki). --- Put a short summary about the HistoryLog in the internal wiki: http://www.intern.facebook.com/intern/wiki/index.php/HiveQueryLog
* Another easy and comprehensive test to add is in TestCliDriver. This is generated code that fires a bunch of queries - we should be easily able to use HiveHistoryViewer to assert that query status is successful for all queries in positive tests. --- Added a HiveHistory check to TestCliDriver. For this to work, SessionState is constructed in the constructor of QTestUtil; not sure whether this is the correct way or not. Changed TestCliDriver.vm to check the history file.
One thing I am concerned about overall is the use of the term 'job' for what is essentially a hive query. I think this creates a lot of room for confusion - since in the hadoop ecosystem job means hadoop job. (We have also overloaded the word task in Hive - which is unfortunate - but almost too late now.) If possible - I would really appreciate if we could replace 'job' with 'query' wherever applicable. (s/startJob/startQuery/ for example). --- Changed all Job references to Query. -- Should we always create the history file, or should history be disabled by default and enabled by setting a jobconf parameter such as 'enable.job.history'?
> structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Affects Versions: 0.2.0 >Reporter: Joydeep Sen Sarma >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_176.txt, patch_176.txt, patch_176.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
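The multi-line-value convention discussed above (a record is complete only when its last line ends with a closing double quote) can be sketched generically as follows; this is an illustrative parser, not HiveHistoryViewer itself.

    import java.util.ArrayList;
    import java.util.List;

    public class HistoryLineSketch {
      // Group raw file lines into complete records: a record is finished only when
      // its last line ends with a closing double quote.
      static List<String> joinRecords(List<String> rawLines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : rawLines) {
          if (current.length() > 0) {
            current.append('\n');
          }
          current.append(line);
          if (line.trim().endsWith("\"")) {       // value is complete
            records.add(current.toString());
            current.setLength(0);
          }
        }
        return records;
      }

      public static void main(String[] args) {
        List<String> raw = java.util.Arrays.asList(
            "QueryStart QUERY_STRING=\"select", "*\" TIME=\"123\"",   // value spans two lines
            "QueryEnd TIME=\"456\"");
        System.out.println(joinRecords(raw));   // two records, the first with an embedded newline
      }
    }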
[jira] Updated: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-176: --- Attachment: patch_176.txt > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Affects Versions: 0.2.0 >Reporter: Joydeep Sen Sarma >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_176.txt, patch_176.txt, patch_176.txt, > patch_176.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-176: --- Attachment: patch_176.txt Fix review comments & added a simple test case. > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Affects Versions: 0.2.0 >Reporter: Joydeep Sen Sarma >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_176.txt, patch_176.txt, patch_176.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660868#action_12660868 ] Suresh Antony commented on HIVE-176: attached a patch with following changes: 1. HiveHistory stored in sessionState variable. HiveHistory is created on the first call to job log history. 2. Removed all references to HiveHistory from CliDriver 3. Added a new config variable "hive.joblog.location" for the location of history file. Not using scratch directory as the default location because We are removing scratch directory at the end of run. > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Affects Versions: 0.2.0 >Reporter: Joydeep Sen Sarma >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_176.txt, patch_176.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-176: --- Attachment: patch_176.txt > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Affects Versions: 0.2.0 >Reporter: Joydeep Sen Sarma >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_176.txt, patch_176.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-202) LINEAGE is not working for join queries
[ https://issues.apache.org/jira/browse/HIVE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-202: --- Attachment: patch_202.txt New patch with test case included. > LINEAGE is not working for join queries > --- > > Key: HIVE-202 > URL: https://issues.apache.org/jira/browse/HIVE-202 > Project: Hadoop Hive > Issue Type: Bug > Components: Clients >Affects Versions: 0.2.0 > Environment: lineage is not working for join queries >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.2.0 > > Attachments: patch_202.txt, patch_202.txt > > > lineage is not giving input tables in case of join queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-202) LINEAGE is not working for join queries
[ https://issues.apache.org/jira/browse/HIVE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660125#action_12660125 ] Suresh Antony commented on HIVE-202: Added a new patch with a test case included. > LINEAGE is not working for join queries > --- > > Key: HIVE-202 > URL: https://issues.apache.org/jira/browse/HIVE-202 > Project: Hadoop Hive > Issue Type: Bug > Components: Clients >Affects Versions: 0.2.0 > Environment: lineage is not working for join queries >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.2.0 > > Attachments: patch_202.txt, patch_202.txt > > > lineage is not giving input tables in case of join queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-202) LINEAGE is not working for join queries
[ https://issues.apache.org/jira/browse/HIVE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660010#action_12660010 ] Suresh Antony commented on HIVE-202: This query is already there in the test... FROM ( FROM ( FROM src1 src1 SELECT src1.key AS c1, src1.value AS c2 WHERE src1.key > 10 and src1.key < 20) a RIGHT OUTER JOIN ( FROM src2 src2 SELECT src2.key AS c3, src2.value AS c4 WHERE src2.key > 15 and src2.key < 25) b ON (a.c1 = b.c3) SELECT a.c1 AS c1, a.c2 AS c2, b.c3 AS c3, b.c4 AS c4) c SELECT c.c1, c.c2, c.c3, c.c4 > LINEAGE is not working for join queries > --- > > Key: HIVE-202 > URL: https://issues.apache.org/jira/browse/HIVE-202 > Project: Hadoop Hive > Issue Type: Bug > Components: Clients >Affects Versions: 0.2.0 > Environment: lineage is not working for join queries >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.2.0 > > Attachments: patch_202.txt > > > lineage is not giving input tables in case of join queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-202) LINEAGE is not working for join queries
[ https://issues.apache.org/jira/browse/HIVE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-202: --- Attachment: patch_202.txt > LINEAGE is not working for join queries > --- > > Key: HIVE-202 > URL: https://issues.apache.org/jira/browse/HIVE-202 > Project: Hadoop Hive > Issue Type: Bug > Components: Clients >Affects Versions: 0.2.0 > Environment: lineage is not working for join queries >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.2.0 > > Attachments: patch_202.txt > > > lineage is not giving input tables in case of join queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-202) LINEAGE is not working for join queries
[ https://issues.apache.org/jira/browse/HIVE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony reassigned HIVE-202: -- Assignee: Suresh Antony > LINEAGE is not working for join queries > --- > > Key: HIVE-202 > URL: https://issues.apache.org/jira/browse/HIVE-202 > Project: Hadoop Hive > Issue Type: Bug > Components: Clients >Affects Versions: 0.2.0 > Environment: lineage is not working for join queries >Reporter: Suresh Antony >Assignee: Suresh Antony >Priority: Minor > Fix For: 0.2.0 > > > lineage is not giving input tables in case of join queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-202) LINEAGE is not working for join queries
LINEAGE is not working for join queries --- Key: HIVE-202 URL: https://issues.apache.org/jira/browse/HIVE-202 Project: Hadoop Hive Issue Type: Bug Components: Clients Affects Versions: 0.2.0 Environment: lineage is not working for join queries Reporter: Suresh Antony Priority: Minor Fix For: 0.2.0 Lineage is not giving the input tables in the case of join queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-176: --- Fix Version/s: 0.2.0 Affects Version/s: 0.2.0 Release Note: Hive History Logging Status: Patch Available (was: Open) > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Affects Versions: 0.2.0 >Reporter: Joydeep Sen Sarma >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_176.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-176: --- Attachment: patch_176.txt logging hive history Files changed: 1. Driver.java 2. CliDriver.java 3. ExecDriver.java > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Reporter: Joydeep Sen Sarma > Attachments: patch_176.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony reassigned HIVE-176: -- Assignee: Suresh Antony > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Reporter: Joydeep Sen Sarma >Assignee: Suresh Antony > Attachments: patch_176.txt > > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-148) extend bin/hive to include the lineage tool
[ https://issues.apache.org/jira/browse/HIVE-148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-148: --- Attachment: patch_148.txt Adding lineage service to hive binary. > extend bin/hive to include the lineage tool > > > Key: HIVE-148 > URL: https://issues.apache.org/jira/browse/HIVE-148 > Project: Hadoop Hive > Issue Type: New Feature > Components: Clients >Affects Versions: 0.2.0 >Reporter: Suresh Antony >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_148.txt > > > biin/hive currently used only to execute the query. Add options to bin/hive > to output the lineage info given the query as the input. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-148) extend bin/hive to include the lineage tool
[ https://issues.apache.org/jira/browse/HIVE-148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-148: --- Fix Version/s: 0.2.0 Affects Version/s: 0.2.0 Release Note: Adding lineage service extension to hive binary Status: Patch Available (was: Open) Adding lineage service extension to hive binary > extend bin/hive to include the lineage tool > > > Key: HIVE-148 > URL: https://issues.apache.org/jira/browse/HIVE-148 > Project: Hadoop Hive > Issue Type: New Feature > Components: Clients >Affects Versions: 0.2.0 >Reporter: Suresh Antony >Assignee: Suresh Antony > Fix For: 0.2.0 > > Attachments: patch_148.txt > > > biin/hive currently used only to execute the query. Add options to bin/hive > to output the lineage info given the query as the input. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-176) structured log for obtaining query stats/info
[ https://issues.apache.org/jira/browse/HIVE-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656855#action_12656855 ] Suresh Antony commented on HIVE-176: Some thoughts about logging: - Thinking of following the same format as Hadoop's logging for the jobs. - Each query will have a separate file; the location of the file will be printed to stdout while running the query. - The format of the entries in the file will be: id=value key1=value1 key2=value2 ... - Different entry types can be: QueryStart, QueryEnd, QueryStepStart, QueryStepEnd, QueryStepProgress. > structured log for obtaining query stats/info > - > > Key: HIVE-176 > URL: https://issues.apache.org/jira/browse/HIVE-176 > Project: Hadoop Hive > Issue Type: Bug > Components: Logging >Reporter: Joydeep Sen Sarma > > Josh wrote: > When launching off hive queries using hive -e is there a way to get the job > id so that I can just queue them up and go check their statuses later? What's > the general pattern for queueing and monitoring without using the libraries > directly? > I'm gonna throw my vote in for a structured log format. Users could tail it > and use whatever queuing or monitoring they wish. It's also probably just a > 30 minute project for someone already familiar with the code. I suggest ^A > seperated key=value pairs per log line. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
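A hedged sketch of a writer for the proposed entry format - an entry type followed by key=value pairs on one line. The quoting and separator choices below are illustrative; only the entry type names come from the list above.

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class HistoryWriterSketch {
      // Emit one entry per line: an entry type followed by key="value" pairs,
      // matching the shape sketched in the comment above (separator choice is illustrative).
      static String entry(String entryType, Map<String, String> keys) {
        StringBuilder sb = new StringBuilder(entryType);
        for (Map.Entry<String, String> e : keys.entrySet()) {
          sb.append(' ').append(e.getKey()).append("=\"").append(e.getValue()).append('"');
        }
        return sb.toString();
      }

      public static void main(String[] args) {
        Map<String, String> keys = new LinkedHashMap<>();
        keys.put("QUERY_ID", "suresh_20081211_0001");   // hypothetical id
        keys.put("TIME", String.valueOf(System.currentTimeMillis()));
        System.out.println(entry("QueryStart", keys));  // one of QueryStart/QueryEnd/QueryStepStart/...
      }
    }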
[jira] Updated: (HIVE-147) Need a tool for extracting lineage info from hive sql
[ https://issues.apache.org/jira/browse/HIVE-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-147: --- Status: Patch Available (was: Open) > Need a tool for extracting lineage info from hive sql > - > > Key: HIVE-147 > URL: https://issues.apache.org/jira/browse/HIVE-147 > Project: Hadoop Hive > Issue Type: New Feature >Reporter: Suresh Antony >Assignee: Suresh Antony > Attachments: patch_147.txt, patch_147.txt > > > Need a tool to extract the line information from hive query. > This tool should take hive query as input and it should output, input and > output tables used by the query. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-147) Need a tool for extracting lineage info from hive sql
[ https://issues.apache.org/jira/browse/HIVE-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-147: --- Attachment: patch_147.txt Adding license text to the newly added files. > Need a tool for extracting lineage info from hive sql > - > > Key: HIVE-147 > URL: https://issues.apache.org/jira/browse/HIVE-147 > Project: Hadoop Hive > Issue Type: New Feature >Reporter: Suresh Antony >Assignee: Suresh Antony > Attachments: patch_147.txt, patch_147.txt > > > Need a tool to extract the line information from hive query. > This tool should take hive query as input and it should output, input and > output tables used by the query. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-147) Need a tool for extracting lineage info from hive sql
[ https://issues.apache.org/jira/browse/HIVE-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony updated HIVE-147: --- Attachment: patch_147.txt LineageInfo.java prints input/output tables given the query. It uses new tree walker architecture. > Need a tool for extracting lineage info from hive sql > - > > Key: HIVE-147 > URL: https://issues.apache.org/jira/browse/HIVE-147 > Project: Hadoop Hive > Issue Type: New Feature >Reporter: Suresh Antony >Assignee: Suresh Antony > Attachments: patch_147.txt > > > Need a tool to extract the line information from hive query. > This tool should take hive query as input and it should output, input and > output tables used by the query. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
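The tree-walk idea can be sketched generically: recurse over the parsed query tree and collect source and destination table names. The Node class and the token names below are stand-ins for illustration, not Hive's actual ASTNode/parser API.

    import java.util.Arrays;
    import java.util.LinkedHashSet;
    import java.util.List;
    import java.util.Set;

    public class LineageWalkSketch {
      // Stand-in for a parsed query tree node (not Hive's real parser types).
      static class Node {
        final String type;          // e.g. "TOK_TABREF" for a source table, "TOK_TAB" for a destination
        final String text;          // table name for table nodes, "" otherwise
        final List<Node> children;
        Node(String type, String text, Node... children) {
          this.type = type; this.text = text; this.children = Arrays.asList(children);
        }
      }

      // Depth-first walk collecting input and output table names.
      static void collect(Node n, Set<String> inputs, Set<String> outputs) {
        if ("TOK_TABREF".equals(n.type)) {
          inputs.add(n.text);
        } else if ("TOK_TAB".equals(n.type)) {
          outputs.add(n.text);
        }
        for (Node child : n.children) {
          collect(child, inputs, outputs);
        }
      }

      public static void main(String[] args) {
        // Hand-built stand-in tree for: INSERT OVERWRITE TABLE dest SELECT ... FROM src1 JOIN src2
        Node tree = new Node("TOK_QUERY", "",
            new Node("TOK_TAB", "dest"),
            new Node("TOK_JOIN", "", new Node("TOK_TABREF", "src1"), new Node("TOK_TABREF", "src2")));
        Set<String> in = new LinkedHashSet<>(), out = new LinkedHashSet<>();
        collect(tree, in, out);
        System.out.println("inputs=" + in + " outputs=" + out);
      }
    }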
[jira] Created: (HIVE-148) extend bin/hive to include the lineage tool
extend bin/hive to include the lineage tool Key: HIVE-148 URL: https://issues.apache.org/jira/browse/HIVE-148 Project: Hadoop Hive Issue Type: New Feature Components: Clients Reporter: Suresh Antony Assignee: Suresh Antony bin/hive is currently used only to execute queries. Add options to bin/hive to output the lineage info given the query as the input. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-147) Need a tool for extracting lineage info from hive sql
Need a tool for extracting lineage info from hive sql - Key: HIVE-147 URL: https://issues.apache.org/jira/browse/HIVE-147 Project: Hadoop Hive Issue Type: New Feature Reporter: Suresh Antony Need a tool to extract the lineage information from a Hive query. This tool should take a Hive query as input and output the input and output tables used by the query. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-147) Need a tool for extracting lineage info from hive sql
[ https://issues.apache.org/jira/browse/HIVE-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Antony reassigned HIVE-147: -- Assignee: Suresh Antony > Need a tool for extracting lineage info from hive sql > - > > Key: HIVE-147 > URL: https://issues.apache.org/jira/browse/HIVE-147 > Project: Hadoop Hive > Issue Type: New Feature >Reporter: Suresh Antony >Assignee: Suresh Antony > > Need a tool to extract the line information from hive query. > This tool should take hive query as input and it should output, input and > output tables used by the query. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-79) Print number of rows inserted to table(s) when the query is finished.
Print number of rows inserted to table(s) when the query is finished. -- Key: HIVE-79 URL: https://issues.apache.org/jira/browse/HIVE-79 Project: Hadoop Hive Issue Type: New Feature Affects Versions: 0.19.0 Reporter: Suresh Antony Priority: Minor Fix For: 0.19.0 It is good to print the number of rows inserted into each table at the end of the query. insert overwrite table tab1 select a.* from tab2 a where a.col1 = 10; This query can print something like: tab1 rows=100 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.