[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks
[ https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197014#comment-15197014 ]

Lefty Leverenz commented on HIVE-12995:
---

Doc note: This adds configuration parameter *hive.orc.splits.allow.synthetic.fileid* to HiveConf.java, so it will need to be documented in the ORC section of Configuration Properties for release 2.1.0.

* [Configuration Properties -- ORC File Format | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-ORCFileFormat]

Should it also be mentioned in the llap documentation? (The parameter description doesn't say anything about llap.)

* [LLAP design document | https://issues.apache.org/jira/secure/attachment/12665704/LLAPdesigndocument.pdf] attached to HIVE-7926

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
> Issue Type: Bug
> Components: llap
> Affects Versions: 2.1.0
> Reporter: Gopal V
> Assignee: Sergey Shelukhin
> Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-12995.01.patch, HIVE-12995.02.patch,
> HIVE-12995.03.patch, HIVE-12995.04.patch, HIVE-12995.patch
>
>
> LLAP synthetic file ids do not have any way of checking whether a collision
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id
> can be compared against the parameters & only accepted if those match.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
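The check proposed in the issue description (track the full (path, mtime, len) tuple in the cache and accept a hit for a synthetic file id only if the tuple matches) could be sketched roughly as below. This is an illustration only, not the code from the attached patches: the class `SyntheticIdCollisionCheck`, its nested `FileTuple` and `Entry` types, and the `put`/`get` signatures are all hypothetical names, and the synthetic id is passed in by the caller so that a collision can be shown directly.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; these names are illustrative and not taken from Hive. */
final class SyntheticIdCollisionCheck {

  /** The (path, mtime, len) tuple that a synthetic file id was derived from. */
  static final class FileTuple {
    final String path;
    final long mtime;
    final long len;

    FileTuple(String path, long mtime, long len) {
      this.path = Objects.requireNonNull(path);
      this.mtime = mtime;
      this.len = len;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof FileTuple)) return false;
      FileTuple other = (FileTuple) o;
      return mtime == other.mtime && len == other.len && path.equals(other.path);
    }

    @Override
    public int hashCode() {
      return Objects.hash(path, mtime, len);
    }
  }

  /** Cached payload stored together with the tuple it was cached for. */
  private static final class Entry {
    final FileTuple source;
    final byte[] data;

    Entry(FileTuple source, byte[] data) {
      this.source = source;
      this.data = data;
    }
  }

  private final Map<Long, Entry> bySyntheticId = new ConcurrentHashMap<>();

  void put(long syntheticId, FileTuple file, byte[] data) {
    bySyntheticId.put(syntheticId, new Entry(file, data));
  }

  /** Returns the cached data only when the stored tuple matches; otherwise null (a miss). */
  byte[] get(long syntheticId, FileTuple file) {
    Entry e = bySyntheticId.get(syntheticId);
    return (e != null && e.source.equals(file)) ? e.data : null;
  }

  public static void main(String[] args) {
    SyntheticIdCollisionCheck cache = new SyntheticIdCollisionCheck();
    FileTuple a = new FileTuple("/warehouse/t/a.orc", 1000L, 64L);
    FileTuple b = new FileTuple("/warehouse/t/b.orc", 2000L, 64L);
    cache.put(42L, a, new byte[] {1, 2, 3});
    // Same synthetic id and same tuple: the hit is accepted.
    System.out.println(cache.get(42L, a) != null);
    // Same synthetic id but a different tuple: treated as a miss, not a data error.
    System.out.println(cache.get(42L, b) != null);
  }
}
```

Treating a mismatch as a plain miss is one possible policy; a real cache might also evict or overwrite the colliding entry.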
[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks
[ https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196077#comment-15196077 ]

Hive QA commented on HIVE-12995:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793362/HIVE-12995.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9825 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7274/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7274/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7274/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793362 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks
[ https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190796#comment-15190796 ]

Hive QA commented on HIVE-12995:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12792173/HIVE-12995.02.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7218/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7218/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7218/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7218/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 62bae5e HIVE-13236 : LLAP: token renewal interval needs to be set (Sergey Shelukhin, reviewed by Siddharth Seth)
+ git clean -f -d
Removing common/src/java/org/apache/hadoop/hive/conf/HiveConf.java.orig
Removing metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java.orig
Removing ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java.orig
Removing ql/src/java/org/apache/hadoop/hive/ql/io/orc/ExternalCache.java
Removing ql/src/java/org/apache/hadoop/hive/ql/io/orc/LocalCache.java
Removing ql/src/java/org/apache/hadoop/hive/ql/io/orc/MetastoreExternalCachesByConf.java
Removing ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java.orig
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 62bae5e HIVE-13236 : LLAP: token renewal interval needs to be set (Sergey Shelukhin, reviewed by Siddharth Seth)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12792173 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks
[ https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186344#comment-15186344 ]

Gopal V commented on HIVE-12995:

Minor nit on the OrcBatchKey::equals() - fix before commit, please.
[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks
[ https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184257#comment-15184257 ]

Gopal V commented on HIVE-12995:

LGTM - +1, pending minor comments left on RB (s/fileId/fileKey/ pretty much)
[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks
[ https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184024#comment-15184024 ]

Sergey Shelukhin commented on HIVE-12995:

Test failures are unrelated.
[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks
[ https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181931#comment-15181931 ]

Hive QA commented on HIVE-12995:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12791275/HIVE-12995.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9781 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.llap.cache.TestIncrementalObjectSizeEstimator.testMetadata
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7172/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7172/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7172/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12791275 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks
[ https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177704#comment-15177704 ]

Hive QA commented on HIVE-12995:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12790833/HIVE-12995.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 9736 tests executed
*Failed tests:*
{noformat}
TestCliDriver-index_compact_2.q-vector_grouping_sets.q-lateral_view_cp.q-and-12-more - did not produce a TEST-*.xml file
TestCliDriver-llap_acid.q-binarysortable_1.q-orc_merge5.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_udf_percentile
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_reorder4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_llap_uncompressed
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_llap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_null_check
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_folder_constants
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.llap.cache.TestIncrementalObjectSizeEstimator.testMetadata
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7151/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7151/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7151/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12790833 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks
[ https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173098#comment-15173098 ]

Sergey Shelukhin commented on HIVE-12995:

Ok, now I'm working on this btw :)
[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks
[ https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15140149#comment-15140149 ]

Sergey Shelukhin commented on HIVE-12995:

Not working on it yet, just assigned so it would show up in my filters...
[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks
[ https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131710#comment-15131710 ]

Gopal V commented on HIVE-12995:

Yeah, we might have to reconsider what happens to inodes once HDFS federation enters the picture. The viewFS model is already somewhat threatened by this.
[jira] [Commented] (HIVE-12995) LLAP: Synthetic file ids need collision checks
[ https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131320#comment-15131320 ]

Sergey Shelukhin commented on HIVE-12995:

It might be easier to make the cache support T as key, then use Long for HDFS and a struct for other FSes. It might even work with multiple FSes per cache easily, and will definitely work with one.
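The suggestion in the comment above, a cache that is generic in its key type with a Long key for HDFS and a struct key for other filesystems, might look roughly like the sketch below. It is an assumption-laden illustration rather than the committed design: `GenericFileKeyCache`, `SyntheticFileKey`, the sample id `16387L`, and the `other-fs://` path are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch of a cache keyed by an arbitrary file key type K. */
final class GenericFileKeyCache<K, V> {

  /** Struct-style key for filesystems without a native file id (illustrative). */
  static final class SyntheticFileKey {
    final String path;
    final long mtime;
    final long len;

    SyntheticFileKey(String path, long mtime, long len) {
      this.path = Objects.requireNonNull(path);
      this.mtime = mtime;
      this.len = len;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof SyntheticFileKey)) return false;
      SyntheticFileKey other = (SyntheticFileKey) o;
      return mtime == other.mtime && len == other.len && path.equals(other.path);
    }

    @Override
    public int hashCode() {
      return Objects.hash(path, mtime, len);
    }
  }

  private final Map<K, V> entries = new HashMap<>();

  void put(K fileKey, V value) {
    entries.put(fileKey, value);
  }

  /** Returns the cached value, or null when the key is absent. */
  V get(K fileKey) {
    return entries.get(fileKey);
  }

  public static void main(String[] args) {
    // With K = Object, one cache can hold Long keys and struct keys side by side.
    GenericFileKeyCache<Object, String> cache = new GenericFileKeyCache<>();
    cache.put(16387L, "cached via a Long file id");
    cache.put(new SyntheticFileKey("other-fs://bucket/f.orc", 5L, 7L), "cached via a struct key");

    System.out.println(cache.get(16387L));
    System.out.println(cache.get(new SyntheticFileKey("other-fs://bucket/f.orc", 5L, 7L)));
  }
}
```

Because the struct key compares path, mtime, and len in `equals`, two different files cannot silently share an entry the way two colliding numeric ids could.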