[jira] [Commented] (HIVE-11139) Emit more lineage information

2015-09-15 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746378#comment-14746378
 ] 

Jimmy Xiang commented on HIVE-11139:


Filed HIVE-11834 to track this issue. Thanks, Mark.

> Emit more lineage information
> -
>
> Key: HIVE-11139
> URL: https://issues.apache.org/jira/browse/HIVE-11139
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-11139.1.patch, HIVE-11139.2.patch, 
> HIVE-11139.3.patch
>
>
> HIVE-1131 emits some column lineage info. But it doesn't support INSERT 
> statements, or CTAS statements. It doesn't emit the predicate information 
> either.
> We can enhance and use the dependency information created in HIVE-1131 to 
> generate more complete lineage info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11139) Emit more lineage information

2015-09-15 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745937#comment-14745937
 ] 

Mark Grover commented on HIVE-11139:


I have a dynamic partitioning query, but at the end of the query it shows an 
error message like:
{quote}
ERROR : Result schema has 2 fields, but we don't get as many dependencies
{quote}

Going through the source code led me to this commit. Was this tested to make 
sure it works with dynamic partitioning? Here's my query, by the way:
{code}
SET hive.exec.dynamic.partition.mode=nonstrict;
DROP TABLE IF EXISTS default.src_mark;
CREATE TABLE default.src_mark (first string, word string)
PARTITIONED BY (length int)
STORED AS PARQUET;
INSERT INTO TABLE default.src_mark PARTITION(length) SELECT first, word, length 
FROM spark_hive.src_flat;
{code}

I verified that all the values in src_flat conform to the schema. Also, at 
the very least, it would be helpful if the error message reported the number 
of dependencies and their names:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/hooks/LineageLogger.java#L251
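
To make the request concrete, here is a minimal sketch of what a more descriptive message could look like. This is not the actual LineageLogger code: the class name, the describe helper, its parameters, and the wording of the message are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hive: formats a schema/dependency
// mismatch so that both the counts and the names are visible in the log.
final class LineageMismatchMessage {

    static String describe(List<String> schemaFields, List<String> dependencyNames) {
        // List.toString() renders the names as [a, b, ...].
        return "Result schema has " + schemaFields.size() + " fields " + schemaFields
            + ", but the number of dependencies is " + dependencyNames.size()
            + " " + dependencyNames;
    }

    public static void main(String[] args) {
        // Placeholder names; they do not reflect the real cause of the error.
        List<String> fields = Arrays.asList("col_a", "col_b");
        List<String> deps = Arrays.asList("src.col_a");
        System.out.println(describe(fields, deps));
    }
}
```

With both lists in the message, a reader could tell at a glance which output column has no dependency recorded.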

Your thoughts would be much appreciated!



[jira] [Commented] (HIVE-11139) Emit more lineage information

2015-07-19 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632922#comment-14632922
 ] 

Prasanth Jayachandran commented on HIVE-11139:
--

[~jxiang] Yes, you are right. Setting maxBackupIndex to 0 will truncate the 
file. If the aim of the appender is to never delete files, how about using 
the normal FileAppender, which should never delete or roll the files? If rolling 
is desired, the recommendation from log4j is to set maxBackupIndex below 10 (for 
performance reasons) and use a high value for maxFileSize (on the order of 
GBs). A similar discussion is here: 
https://jazz.net/forum/questions/150960/orgapachelog4jrollingfileappender-maxbackupindex-limit
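
The rollover behaviour discussed here can be illustrated with a small in-memory model. This is only a sketch based on a reading of the log4j 1.2 RollingFileAppender rollover logic, not the log4j code itself: the RollOverModel class, its map of file names to contents, and the append method are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of a rolling appender. A map stands in for the log
// directory (file name -> contents); nothing here touches the file system.
final class RollOverModel {
    final Map<String, String> files = new LinkedHashMap<>();
    final String fileName;
    final int maxBackupIndex;

    RollOverModel(String fileName, int maxBackupIndex) {
        this.fileName = fileName;
        this.maxBackupIndex = maxBackupIndex;
        files.put(fileName, "");
    }

    void append(String text) {
        files.merge(fileName, text, String::concat);
    }

    void rollOver() {
        if (maxBackupIndex > 0) {
            // Delete the oldest backup, then shift each remaining backup up by one.
            files.remove(fileName + "." + maxBackupIndex);
            for (int i = maxBackupIndex - 1; i >= 1; i--) {
                String old = files.remove(fileName + "." + i);
                if (old != null) {
                    files.put(fileName + "." + (i + 1), old);
                }
            }
            // The active file becomes backup number 1.
            files.put(fileName + ".1", files.get(fileName));
        }
        // In every case the active file is reopened without append, i.e. truncated.
        files.put(fileName, "");
    }

    public static void main(String[] args) {
        RollOverModel model = new RollOverModel("hive.log", 0);
        model.append("old entries");
        model.rollOver();
        // With maxBackupIndex = 0 no backup is created and the old entries are gone.
        System.out.println(model.files);
    }
}
```

In this model, a maxBackupIndex of 0 skips the delete and rename steps but still truncates the active file, which is why it does not preserve old log entries.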



[jira] [Commented] (HIVE-11139) Emit more lineage information

2015-07-19 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632908#comment-14632908
 ] 

Jimmy Xiang commented on HIVE-11139:


[~prasanth_j], this new RFA never renames/deletes a log file. Per the javadoc 
of RFA, it seems that the log file is truncated and no backup file is created 
if maxBackupIndex = 0, right? If we can achieve the same thing with the 
existing RFA, that would be great. If not, we are happy to port it to log4j2 too.



[jira] [Commented] (HIVE-11139) Emit more lineage information

2015-07-18 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632655#comment-14632655
 ] 

Prasanth Jayachandran commented on HIVE-11139:
--

Hi [~szehon] and [~jxiang],

I am working on HIVE-11304 and in the process noticed that this jira added a 
new RFA, NoDeleteRollingFileAppender. What is its purpose? 
If I understand correctly, it doesn't delete the old rollover files under any 
condition. If that's the case, similar behaviour can be obtained by setting 
maxBackupIndex to a negative value or 0 by default in the log4j.properties file. 
http://grepcode.com/file/repo1.maven.org/maven2/log4j/log4j/1.2.17/org/apache/log4j/RollingFileAppender.java#141
The delete codepath gets triggered only when maxBackupIndex is > 0, which should 
get you the same behaviour of not deleting at all.
If it serves a different purpose, can you please explain it? It's hard to 
port such custom appenders to log4j2, as the APIs are not compatible.



[jira] [Commented] (HIVE-11139) Emit more lineage information

2015-07-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611676#comment-14611676
 ] 

Hive QA commented on HIVE-11139:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12743197/HIVE-11139.3.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9134 tests executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4473/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4473/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4473/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12743197 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-11139) Emit more lineage information

2015-07-01 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611194#comment-14611194
 ] 

Szehon Ho commented on HIVE-11139:
--

+1



[jira] [Commented] (HIVE-11139) Emit more lineage information

2015-07-01 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611188#comment-14611188
 ] 

Jimmy Xiang commented on HIVE-11139:


Yeah, v2 is on RB.



[jira] [Commented] (HIVE-11139) Emit more lineage information

2015-07-01 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611184#comment-14611184
 ] 

Szehon Ho commented on HIVE-11139:
--

Can you update the review board?



[jira] [Commented] (HIVE-11139) Emit more lineage information

2015-07-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611171#comment-14611171
 ] 

Hive QA commented on HIVE-11139:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12743113/HIVE-11139.2.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9135 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ptf
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_ptf
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4466/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4466/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4466/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12743113 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-11139) Emit more lineage information

2015-06-30 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609129#comment-14609129
 ] 

Hive QA commented on HIVE-11139:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742613/HIVE-11139.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4448/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4448/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4448/

Messages:
{noformat}
 This message was trimmed, see log for full details 
 [copy] Copying 11 files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
spark-client ---
[INFO] Compiling 5 source files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes
[INFO] 
[INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client ---
[INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar
[INFO] Copying guava-14.0.1.jar to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ spark-client ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
spark-client ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client ---
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar
 to 
/home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to 
/home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Query Language 2.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec ---
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
hive-exec ---
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
Generating vector expression code
Generating vector expression test code
[INFO] Executed tasks
[INFO] 
[INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec ---
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java
 added.
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean
 added.
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java
 added.
[INFO] 
[INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec ---
[INFO] ANTLR: Processing source directory 
/data/hive-ptest/working/apache-github-source-source/ql/src/java
ANTLR Parser Generator  Version 3.4
org/apache/hadoop/hive/ql/parse/HiveLexer.g
org/apache/hadoop/hive/ql/parse/HiveParser.g
warning(200): IdentifiersParser.g:455:5: 
Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_MAP" using 
multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that input
warning(200): IdentifiersParser.g:455:5: 
Decision can match input such as "{KW_REGEXP, KW_RLIKE} KW_UNION KW_SELECT" 
using multiple alternatives: 2, 9

As a result, alternative(s) 9 were disabled for that