[ 
https://issues.apache.org/jira/browse/HIVE-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125502#comment-13125502
 ] 

John Sichi commented on HIVE-1496:
----------------------------------

Discussion from IRC:

{noformat}
ssalbiz: jsichi: I was looking at ashutosh's patch for 1496, and I was 
wondering if the problem with it is the lack of atomicity? It seems to chain the 
map-red tasks to populate the index with the DDLTask correctly if I'm reading 
the patch/testcase right
[4:37pm] jsichi: ssalbiz:  you're right; I misread the patch--didn't notice the 
addIdxTasks part.  ashutosh, sorry about that.
[4:40pm] jsichi: but I don't think we should be calling db.createIndex directly 
from DDLSemanticAnalyzer...should still be chaining in the DDLWork for that
[4:41pm] ssalbiz: right, I agree
[4:43pm] ashutosh: jsichi: I remember you commenting on the jira that 
atomicity will be an issue, but it's ok to solve it separately in follow-up work
[4:44pm] jsichi: ashutosh:  agreed.  But we should still be following the usual 
pattern for executing the metastore update from within a task (rather than 
analyzer)
[4:45pm] jsichi: Another followup is to support a mode whereby updates to a 
table trigger a rebuild of the corresponding index partitions.
[4:47pm] jsichi: I guess the reason you had to do it early (in the analyzer) is 
that the build-task generation requires the metastore to already be populated.
[4:47pm] ashutosh: yeah.. correct thats the reason
[4:47pm] jsichi: hmmm
[4:48pm] ashutosh: build-task assumes all the data to be populated
[4:55pm] jsichi: I guess the only way to resolve that would be to factor out 
the code that knows how to make up an Index object from a CreateIndexDesc, and 
then create a temp during analysis (then discard it), then later create the 
real one when the task executes.
[4:56pm] ashutosh: yeah.. i think temp object approach may work
[4:57pm] ashutosh: but probably will churn around lot of code
[4:58pm] jsichi: having EXPLAIN able to show what's gonna be done without 
actually doing it seems like a valuable guarantee to preserve
[4:58pm] jsichi: (see e.g. HIVE-2478)
[5:01pm] ashutosh: actually temp obj approach may not work because createIndex 
task connects to metastore to get the metadata … so it must exist in metastore
[5:02pm] jsichi: that's only for verifying that the index name does not 
conflict, right?
[5:05pm] jsichi: woohoo, finally gonna get a clean trunk build on Jenkins since 
I'm about to commit HIVE-2493!
[5:05pm] ashutosh: awesome
[5:05pm] jsichi: already got a clean run on 0.8
[5:07pm] ashutosh: for me HBase tests always fail
[5:07pm] ashutosh: with exception NoRegionServerFound exception
[5:07pm] ashutosh: whats the magic there ?
[5:08pm] ssalbiz: looking at the TableBasedIndexHandler code, it does a bunch 
of checks to ensure that the partition specs of the table and index are in sync 
in the metastore. I think it would be possible to write a helper method in 
TableBasedIndexHandler that can be used to generate Index Map-Red Tasks without 
relying on the metastore at all if we can assume that the Index ms partition 
spec and the base table partition spec are going to be consistent when the da
[5:09pm] ssalbiz: seems like less code churn than trying to feed the current 
method mock metastore/Index objects to make those checks pass
[5:10pm] jsichi: ashutosh:  hmm, dunno...I'll bet there's a real exception 
buried somewhere deep in the logs...
[5:10pm] jsichi: ssalbiz:  I don't think we want to change the index handler 
interface though
[5:15pm] ssalbiz: hmmm, ok, in that case I guess we will have to feed temp 
objects to the existing generateIndexBuildTaskList method
[5:18pm] ashutosh: one possibility to avoid explain problem is to not execute 
ms operation in semantic analyzer if its an explain query
[5:19pm] jsichi: There's already precedent for the temp objects....e.g. 
Hive.createIndex already calls indexHandler.analyzeIndexDefinition with 
indexDesc and tt params which haven't actually been written to the metastore 
yet.
[5:20pm] jsichi: ashutosh:  that wouldn't work, would it, since the explain is 
supposed to show the index build tasks too
[5:22pm] ashutosh: hmm.. right.. it won't show it.. but will at least prevent 
execution of unwanted ms operation in case of explain ...
[5:22pm] ashutosh: i can take a look at temp object approach
[5:23pm] jsichi: ssalbiz is actually going to work on indexing again as part of 
a school project, so if you're OK with it, we can assign back to him.
[5:25pm] ashutosh: ya.. thats fine..
[5:25pm] ashutosh: he can take it up
[5:25pm] jsichi: OK cool.  I'll copy-paste this conversation into JIRA in case 
we forget anything later.
{noformat}
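
For reference, the temp-object approach discussed above might be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the patch: every class here (CreateIndexDesc, Index, FakeMetastore) is a hypothetical stand-in rather than Hive's real type, and the index table naming is made up. The point it tries to show is that one factored-out builder is called twice: once during analysis to plan the build tasks from a discarded temp object (so EXPLAIN has no side effects), and once when the DDL task executes and actually writes to the metastore.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these classes are Hive's real ones.
public class TempIndexSketch {

    /** Stand-in for the parsed CREATE INDEX statement. */
    static final class CreateIndexDesc {
        final String indexName;
        final String baseTable;
        final List<String> columns;

        CreateIndexDesc(String indexName, String baseTable, List<String> columns) {
            this.indexName = indexName;
            this.baseTable = baseTable;
            this.columns = new ArrayList<>(columns);
        }
    }

    /** Stand-in for the metastore Index object. */
    static final class Index {
        final String name;
        final String baseTable;
        final String indexTable;
        final List<String> columns;

        Index(String name, String baseTable, String indexTable, List<String> columns) {
            this.name = name;
            this.baseTable = baseTable;
            this.indexTable = indexTable;
            this.columns = columns;
        }
    }

    /** Stand-in metastore: a map keyed by index name. */
    static final class FakeMetastore {
        private final Map<String, Index> indexes = new HashMap<>();

        boolean hasIndex(String name) {
            return indexes.containsKey(name);
        }

        void createIndex(Index idx) {
            // The name-conflict check happens here, at execution time.
            if (indexes.containsKey(idx.name)) {
                throw new IllegalStateException("index already exists: " + idx.name);
            }
            indexes.put(idx.name, idx);
        }
    }

    /** Factored-out builder: a pure function with no metastore access. */
    static Index buildIndexObject(CreateIndexDesc desc) {
        // Index table naming is an assumption made for this sketch.
        String indexTable = desc.baseTable + "__" + desc.indexName + "__";
        return new Index(desc.indexName, desc.baseTable, indexTable, desc.columns);
    }

    /** Analysis phase: plan from a temp Index; the metastore is never touched. */
    static List<String> analyze(CreateIndexDesc desc) {
        Index temp = buildIndexObject(desc); // discarded after planning
        List<String> plan = new ArrayList<>();
        plan.add("DDL: create index " + temp.name + " on " + temp.baseTable);
        plan.add("MAPRED: populate " + temp.indexTable + " from " + temp.baseTable);
        return plan;
    }

    /** Execution phase: the DDL task builds the real Index and persists it. */
    static void executeDdl(CreateIndexDesc desc, FakeMetastore ms) {
        ms.createIndex(buildIndexObject(desc));
    }

    public static void main(String[] args) {
        FakeMetastore ms = new FakeMetastore();
        List<String> cols = new ArrayList<>();
        cols.add("key");
        CreateIndexDesc desc = new CreateIndexDesc("idx1", "src", cols);

        // EXPLAIN-style use: show the plan without executing anything.
        for (String step : analyze(desc)) {
            System.out.println(step);
        }
        System.out.println("after analyze, exists: " + ms.hasIndex("idx1"));

        executeDdl(desc, ms);
        System.out.println("after execute, exists: " + ms.hasIndex("idx1"));
    }
}
```

In this sketch the existing index handler interface is left alone; the real build-task generation would simply be handed the temp object in place of one read back from the metastore.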

                
> enhance CREATE INDEX to support immediate index build
> -----------------------------------------------------
>
>                 Key: HIVE-1496
>                 URL: https://issues.apache.org/jira/browse/HIVE-1496
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.7.0, 0.8.0
>            Reporter: John Sichi
>            Assignee: Syed S. Albiz
>         Attachments: hive-1496.patch
>
>
> Currently we only support WITH DEFERRED REBUILD.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

