[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks

2016-03-16 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197014#comment-15197014
 ] 

Lefty Leverenz commented on HIVE-12995:
---

Doc note:  This adds configuration parameter 
*hive.orc.splits.allow.synthetic.fileid* to HiveConf.java, so it will need to 
be documented in the ORC section of Configuration Properties for release 2.1.0.

* [Configuration Properties -- ORC File Format | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-ORCFileFormat]

Should it also be mentioned in the llap documentation?  (The parameter 
description doesn't say anything about llap.)

* [LLAP design document | 
https://issues.apache.org/jira/secure/attachment/12665704/LLAPdesigndocument.pdf]
 attached to HIVE-7926

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-12995.01.patch, HIVE-12995.02.patch, 
> HIVE-12995.03.patch, HIVE-12995.04.patch, HIVE-12995.patch
>
>
> LLAP synthetic file ids do not have any way of checking whether a collision 
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will 
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple 
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id 
> can be compared against the parameters & only accepted if those match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks

2016-03-15 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196077#comment-15196077
 ] 

Hive QA commented on HIVE-12995:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793362/HIVE-12995.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9825 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7274/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7274/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7274/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793362 - PreCommit-HIVE-TRUNK-Build

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12995.01.patch, HIVE-12995.02.patch, 
> HIVE-12995.03.patch, HIVE-12995.04.patch, HIVE-12995.patch
>
>
> LLAP synthetic file ids do not have any way of checking whether a collision 
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will 
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple 
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id 
> can be compared against the parameters & only accepted if those match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks

2016-03-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190796#comment-15190796
 ] 

Hive QA commented on HIVE-12995:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12792173/HIVE-12995.02.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7218/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7218/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7218/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7218/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 62bae5e HIVE-13236 : LLAP: token renewal interval needs to be 
set (Sergey Shelukhin, reviewed by Siddharth Seth)
+ git clean -f -d
Removing common/src/java/org/apache/hadoop/hive/conf/HiveConf.java.orig
Removing 
metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java.orig
Removing ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java.orig
Removing ql/src/java/org/apache/hadoop/hive/ql/io/orc/ExternalCache.java
Removing ql/src/java/org/apache/hadoop/hive/ql/io/orc/LocalCache.java
Removing 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/MetastoreExternalCachesByConf.java
Removing ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java.orig
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 62bae5e HIVE-13236 : LLAP: token renewal interval needs to be 
set (Sergey Shelukhin, reviewed by Siddharth Seth)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12792173 - PreCommit-HIVE-TRUNK-Build

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12995.01.patch, HIVE-12995.02.patch, 
> HIVE-12995.02.patch, HIVE-12995.patch
>
>
> LLAP synthetic file ids do not have any way of checking whether a collision 
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will 
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple 
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id 
> can be compared against the parameters & only accepted if those match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks

2016-03-08 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186344#comment-15186344
 ] 

Gopal V commented on HIVE-12995:


Minor nit on the OrcBatchKey::equals() - fix before commit, please.

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12995.01.patch, HIVE-12995.patch
>
>
> LLAP synthetic file ids do not have any way of checking whether a collision 
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will 
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple 
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id 
> can be compared against the parameters & only accepted if those match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks

2016-03-07 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184257#comment-15184257
 ] 

Gopal V commented on HIVE-12995:


LGTM - +1, pending minor comments left on RB (s/fileId/fileKey/ pretty much)

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12995.01.patch, HIVE-12995.patch
>
>
> LLAP synthetic file ids do not have any way of checking whether a collision 
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will 
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple 
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id 
> can be compared against the parameters & only accepted if those match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks

2016-03-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184024#comment-15184024
 ] 

Sergey Shelukhin commented on HIVE-12995:
-

Test failures are unrelated. 

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12995.01.patch, HIVE-12995.patch
>
>
> LLAP synthetic file ids do not have any way of checking whether a collision 
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will 
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple 
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id 
> can be compared against the parameters & only accepted if those match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks

2016-03-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181931#comment-15181931
 ] 

Hive QA commented on HIVE-12995:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12791275/HIVE-12995.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9781 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.llap.cache.TestIncrementalObjectSizeEstimator.testMetadata
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7172/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7172/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7172/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12791275 - PreCommit-HIVE-TRUNK-Build

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12995.01.patch, HIVE-12995.patch
>
>
> LLAP synthetic file ids do not have any way of checking whether a collision 
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will 
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple 
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id 
> can be compared against the parameters & only accepted if those match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks

2016-03-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177704#comment-15177704
 ] 

Hive QA commented on HIVE-12995:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12790833/HIVE-12995.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 9736 tests 
executed
*Failed tests:*
{noformat}
TestCliDriver-index_compact_2.q-vector_grouping_sets.q-lateral_view_cp.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-llap_acid.q-binarysortable_1.q-orc_merge5.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_udf_percentile
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_reorder4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_llap_uncompressed
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_llap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_null_check
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_folder_constants
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.llap.cache.TestIncrementalObjectSizeEstimator.testMetadata
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7151/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7151/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7151/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12790833 - PreCommit-HIVE-TRUNK-Build

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12995.patch
>
>
> LLAP synthetic file ids do not have any way of checking whether a collision 
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will 
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple 
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id 
> can be compared against the parameters & only accepted if those match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks

2016-02-29 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173098#comment-15173098
 ] 

Sergey Shelukhin commented on HIVE-12995:
-

Ok, now I'm working on this btw :)

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> LLAP synthetic file ids do not have any way of checking whether a collision 
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will 
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple 
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id 
> can be compared against the parameters & only accepted if those match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks

2016-02-09 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15140149#comment-15140149
 ] 

Sergey Shelukhin commented on HIVE-12995:
-

Not working on it yet, just assigned so it would show up in my filters... 

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> LLAP synthetic file ids do not have any way of checking whether a collision 
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will 
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple 
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id 
> can be compared against the parameters & only accepted if those match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks

2016-02-03 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131710#comment-15131710
 ] 

Gopal V commented on HIVE-12995:


Yeah, we might have to reconsider what happens to inodes once HDFS federation 
enters the picture.

The viewFS model is already somewhat threatened by this.

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>
> LLAP synthetic file ids do not have any way of checking whether a collision 
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will 
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple 
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id 
> can be compared against the parameters & only accepted if those match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks

2016-02-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131320#comment-15131320
 ] 

Sergey Shelukhin commented on HIVE-12995:
-

It might be easier to make cache support T as key, then use Long for HDFS and a 
struct for other FSes. It might ever work with multiple FSes per cache easily, 
will definitely work with one.

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>
> LLAP synthetic file ids do not have any way of checking whether a collision 
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will 
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple 
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id 
> can be compared against the parameters & only accepted if those match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)