[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-11675:
    Labels: TODOC2.1 (was: )

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Labels: TODOC2.1
> Fix For: 2.1.0
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch,
>     HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch,
>     HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.11.patch,
>     HIVE-11675.12.patch, HIVE-11675.13.patch, HIVE-11675.14.patch,
>     HIVE-11675.patch, HIVE-11675.premature.opti.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Resolution: Fixed
    Fix Version/s: 2.1.0
    Status: Resolved (was: Patch Available)

Committed to master after resolving conflicts.

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Fix For: 2.1.0
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch,
>     HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch,
>     HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.11.patch,
>     HIVE-11675.12.patch, HIVE-11675.13.patch, HIVE-11675.14.patch,
>     HIVE-11675.patch, HIVE-11675.premature.opti.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.13.patch

Another bugfix.

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch,
>     HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch,
>     HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.11.patch,
>     HIVE-11675.12.patch, HIVE-11675.13.patch, HIVE-11675.patch,
>     HIVE-11675.premature.opti.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.12.patch

Fixes a bug where results are incorrect when not returning a footer

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch,
>     HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch,
>     HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.11.patch,
>     HIVE-11675.12.patch, HIVE-11675.patch, HIVE-11675.premature.opti.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: (was: HIVE-11675.11.patch)

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch,
>     HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch,
>     HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.11.patch,
>     HIVE-11675.patch, HIVE-11675.premature.opti.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.11.patch

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch,
>     HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch,
>     HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.11.patch,
>     HIVE-11675.patch, HIVE-11675.premature.opti.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.11.patch

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch,
>     HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch,
>     HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.11.patch,
>     HIVE-11675.patch, HIVE-11675.premature.opti.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.premature.opti.patch

Removing some premature optimization and attaching the patch to re-add it if desired.

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch,
>     HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch,
>     HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.patch,
>     HIVE-11675.premature.opti.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.10.patch

This patch needs a small update, I'll take a queue spot for now ;)

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch,
>     HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch,
>     HIVE-11675.09.patch, HIVE-11675.10.patch, HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.08.patch

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch,
>     HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.08.patch,
>     HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.07.patch

Addressed RB feedback. Also refactored all the cache classes out of OrcInputFormat to avoid semanticanalyzerization ;)

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch,
>     HIVE-11675.06.patch, HIVE-11675.07.patch, HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.06.patch

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch,
>     HIVE-11675.06.patch, HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.05.patch

Tests, with some refactoring to make them possible.
corrupt IDs should also be restored in PPD case, I guess; I will file a separate JIRA. Might be better to queue the updates right inside metastore; unlike with plain get, metastore detects if they are corrupted.

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.05.patch,
>     HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.04.patch

Rebasing the patch

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.04.patch, HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.03.patch

Addressed the feedback; I'll see if the test can be added.

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.03.patch, HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.02.patch

Rebased the patch again.

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
> NO PRECOMMIT TESTS
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Description:
Need to take a look at the best flow. It won't be much different if we do filtering metastore call for each partition. So perhaps we'd need the custom sync point/batching after all.
Or we can make it opportunistic and not fetch any footers unless it can be pushed down to metastore or fetched from local cache, that way the only slow threaded op is directory listings

  was:
Need to take a look at the best flow. It won't be much different if we do filtering metastore call for each partition. So perhaps we'd need the custom sync point/batching after all.
Or we can make it opportunistic and not fetch any footers unless it can be pushed down to metastore or fetched from local cache, that way the only slow threaded op is directory listings
NO PRECOMMIT TESTS

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch,
>     HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: (was: HIVE-11675.01.patch)

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
> NO PRECOMMIT TESTS
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.01.patch

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.01.patch,
>     HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
> NO PRECOMMIT TESTS
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.01.patch

massively rebased patch...

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.01.patch, HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
> NO PRECOMMIT TESTS
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Description:
Need to take a look at the best flow. It won't be much different if we do filtering metastore call for each partition. So perhaps we'd need the custom sync point/batching after all.
Or we can make it opportunistic and not fetch any footers unless it can be pushed down to metastore or fetched from local cache, that way the only slow threaded op is directory listings
NO PRECOMMIT TESTS

  was:
Need to take a look at the best flow. It won't be much different if we do filtering metastore call for each partition. So perhaps we'd need the custom sync point/batching after all.
Or we can make it opportunistic and not fetch any footers unless it can be pushed down to metastore or fetched from local cache, that way the only slow threaded op is directory listings

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings
> NO PRECOMMIT TESTS
[jira] [Updated] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-11675:
    Attachment: HIVE-11675.patch

> make use of file footer PPD API in ETL strategy or separate strategy
>
> Key: HIVE-11675
> URL: https://issues.apache.org/jira/browse/HIVE-11675
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-11675.patch
>
> Need to take a look at the best flow. It won't be much different if we do
> filtering metastore call for each partition. So perhaps we'd need the custom
> sync point/batching after all.
> Or we can make it opportunistic and not fetch any footers unless it can be
> pushed down to metastore or fetched from local cache, that way the only slow
> threaded op is directory listings