Hello, "I think the simplest thing that should be done first is to provide option to skip the check" I agree that whatever we do, we should not introduce any change in user experience by default. But since the default behaviour is to not set any TTL in the metadata, I have mixed feelings about that. In our use case, though, we will mostly be trying to optimize a single application, so we could also add this session option at query time. Long story short: I do not have strong opinions about what the default should be.
Concerning the overall change (the introduction of a TTL): can we submit a design document, or would you prefer to invest in the longer-term metadata repository?

Regards, Joel

On Thu, Aug 2, 2018 at 6:28 AM, Padma Penumarthy <penumarthy.pa...@gmail.com> wrote:

> I think the simplest thing that should be done first is to provide an option to skip the check. The default behavior for that option will be what we do today, i.e. check the root directory and all sub-directories underneath.
>
> Thanks
> Padma
>
> On Mon, Jul 30, 2018 at 3:01 AM, Joel Pfaff <joel.pf...@gmail.com> wrote:
>
> > Hello,
> >
> > Thanks a lot for all this feedback, trying to respond to everything below:
> >
> > @Parth:
> > "I don't think we would want to maintain a TTL for the metadata store so introducing one now would mean that we might break backward compatibility down the road."
> > Yes, I am aware of this activity starting, and I agree that whatever solution is decided later on for the new metadata store, it most probably won't support a concept of TTL. This means that we would either have to break the support of the `WITH TTL` extension of the SQL command, or to ignore it down the road. Neither of these solutions seems particularly appealing to me.
> >
> > @Padma:
> > "What issues we need to worry about if different directories in the hierarchy are checked last at different times ?"
> > Knowing that the refresh is always recursive, we can only have two cases: the parent-level cache is refreshed at the same time as the child-level cache, or the parent-level cache is older than the child-level cache (because a query has run in a sub-directory, which triggered a refresh of the metadata in that sub-directory). In both cases, checking the timestamp of the cache file at the root directory of the query is enough to know if the TTL criterion is respected.
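[Editor's note] Joel's point above, that because refreshes are always recursive a single timestamp check on the root cache file decides whether the TTL criterion holds, can be sketched as follows. This is a minimal illustration under stated assumptions, not Drill's actual Java implementation; the function name and the idea of passing the TTL explicitly are hypothetical:

```python
import os
import time

def metadata_is_fresh(root_cache_path, ttl_seconds, now=None):
    """Return True if the root metadata cache file is still within its TTL.

    Hypothetical helper: only the root cache file's mtime is inspected,
    which is the fast path this thread is trying to enable.
    """
    now = time.time() if now is None else now
    # Fresh as long as less than ttl_seconds elapsed since the cache
    # file was last written (or touched).
    return now < os.path.getmtime(root_cache_path) + ttl_seconds
```

A child cache may well be newer (a query against a sub-directory can trigger its own refresh), but that does not invalidate the root-level decision; the per-query cost becomes a single stat() call instead of one per file and directory.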
> > In the case where the cache files are not refreshed at the same time between parent and child directories, and the parent's cache is still valid with regard to its TTL, Drill would trust the parent cache and issue an execution plan with this set of files. The same query on a child folder would use the child's cache, which would have been refreshed more recently, and this could result in issuing an execution plan with another set of files. So basically, this TTL feature could create discrepancies in the results, and these discrepancies could last up to the TTL value.
> >
> > "Do we have to worry about coordinating against multiple drillbits ?"
> > That would be better indeed; since the problem already exists today (I have not found any locking mechanism on the metadata file), I am not sure this change would make it worse. So the reply is yes, we should worry, but I think the fix for that would be independent of this change.
> >
> > "Another option is if the difference between modification time of directory and metadata cache file is within TTL limit, do not do anything. If we do that, do we get errors during execution (I think so) ?"
> > We would get errors if files were removed between the time of the last generation of metadata and the time of the execution. As in the case above, this can already happen, since there is currently no guarantee that the files seen at planning time will still be there at execution time. The timeframe would increase from a few milliseconds to several minutes, so this kind of problem would occur much more frequently. I would recommend quietly ignoring missing files by treating them as empty files.
> >
> > "Also, how to reset that state and do metadata cache refresh eventually ?"
> > We could reuse the REFRESH TABLE METADATA command to force the refresh.
> > This would allow collaborative ingestion jobs to force the refresh when the datasets have changed. Non-collaborative jobs would then rely on the TTL to get the new dataset available.
> >
> > "Instead of TTL, I think having a system/session option that will let us skip this check altogether would be a good thing to have. So, if we know we are not adding new data, we can set that option."
> > I would see the need to set the TTL per table, since different tables will have different update frequencies. I agree on a session option to bypass the TTL check, so that a user can always see the latest dataset. The question then becomes: what would be the default value for this option?
> >
> > Regards, Joel
> >
> > On Fri, Jul 13, 2018 at 9:06 AM, Padma Penumarthy <penumarthy.pa...@gmail.com> wrote:
> >
> > > Hi Joel,
> > >
> > > This is my understanding:
> > > We have a list of all directories (i.e. all subdirectories and their subdirectories, etc.) in the metadata cache file of each directory. We go through that list of directories and check each directory's modification time against the modification time of the metadata cache file in that directory. If this does not match for any of the directories, we rebuild the metadata cache for the whole hierarchy. The reason we have to do this is that adding new files will only update the modification time of the immediate parent directory and not the whole hierarchy.
> > >
> > > Regarding your proposal, some random thoughts:
> > > How will you get a current time that can be compared against the last modification time set by the file system ? I think you meant comparing the current system time of the running java process, i.e. the drillbit, against the last time we checked whether the metadata cache needs to be updated for that directory.
> > > What issues do we need to worry about if different directories in the hierarchy are checked last at different times ?
> > > Do we have to worry about coordinating against multiple drillbits ?
> > >
> > > Another option is: if the difference between the modification time of the directory and the metadata cache file is within the TTL limit, do not do anything. If we do that, do we get errors during execution (I think so) ? Also, how to reset that state and do a metadata cache refresh eventually ? We are not saving time on modification time checks here.
> > >
> > > Instead of TTL, I think having a system/session option that will let us skip this check altogether would be a good thing to have. So, if we know we are not adding new data, we can set that option.
> > >
> > > Instead of saving this TTL in the metadata cache file for each table (directory), is it better to have this TTL as a global system or session option ? In that case, we cannot have a different TTL for each table, but it makes it much simpler. Otherwise, there are some complications to think about. We have a root metadata file per directory, with each of the subdirectories underneath having their own metadata file. So, if we update the TTL of the root directory, do we update it for all the subdirectories or just the top-level directory ? What issues do we need to think about if the TTL of the root directory and the subdirectories are different ?
> > >
> > > Thanks
> > > Padma
> > >
> > > On Thu, Jul 12, 2018 at 8:07 AM, Joel Pfaff <joel.pf...@gmail.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > Thanks for the feedback.
> > > >
> > > > The logic I had in mind was to add the TTL, as a refresh_interval field, in the root metadata file.
> > > > At each query, the current time would be compared to the sum of the modification time of the root metadata file and the refresh_interval. If the current time is greater, it would mean the metadata may be invalid, so the regular process would apply: recursively going through the files to check for updates, and triggering a full metadata cache refresh if any change is detected, or just touching the metadata file to align its modification time with the current time if no change is detected. If the current time is smaller, the root metadata would be trusted (without additional checks) and the planning would continue.
> > > >
> > > > So in most cases, only the timestamp of the root metadata file would be checked. In the worst case (at most once per TTL), all the timestamps would be checked.
> > > >
> > > > Regards, Joel
> > > >
> > > > On Thu, Jul 12, 2018 at 4:47 PM, Vitalii Diravka <vitalii.dira...@gmail.com> wrote:
> > > >
> > > > > Hi Joel,
> > > > >
> > > > > Sounds reasonable.
> > > > > But if Drill checks this TTL property from the metadata cache file for every query and for every file instead of the file timestamp, it will not give the benefit. I suppose we can add this TTL property to only the root metadata cache file and check it only once per query.
> > > > >
> > > > > Could you clarify the details: what is the TTL time?
> > > > > How could the TTL info be used to determine whether a refresh is needed for the query?
> > > > >
> > > > > Kind regards
> > > > > Vitalii
> > > > >
> > > > > On Thu, Jul 12, 2018 at 4:40 PM Joel Pfaff <joel.pf...@gmail.com> wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > Today, on a table for which we have created statistics (through the REFRESH TABLE METADATA <path to table> command), Drill validates the timestamp of every file or directory involved in the scan.
> > > > > >
> > > > > > If the timestamps of the files are greater than the one of the metadata file, then a regeneration of the metadata file is triggered. In the case where the timestamp of the metadata file is the greatest, the planning continues without regenerating the metadata.
> > > > > >
> > > > > > When the number of files to be queried increases, this operation can take a significant amount of time. We have seen cases where this validation step alone takes 3 to 5 seconds (just checking the timestamps), meaning the planning was taking far more time than the query execution. And this can be problematic in some use cases where response time is favored over the `accuracy` of the data.
> > > > > >
> > > > > > What would you think about adding an option to the metadata generation, so that the metadata is trusted for a configurable time period?
> > > > > > Example: REFRESH TABLE METADATA <path to table> WITH TTL='15m'
> > > > > > The exact syntax, of course, needs to be thought through.
> > > > > >
> > > > > > This TTL would be stored in the metadata file, and used to determine if a refresh is needed at each query.
> > > > > > And this would significantly decrease the planning time when the number of files represented in the metadata file is large.
> > > > > >
> > > > > > Of course, this means that there could be cases where the metadata would be wrong, so cases like the one below would need to be solved (since they may happen much more frequently):
> > > > > > https://issues.apache.org/jira/browse/DRILL-6194
> > > > > > But my feeling is that since we already have a kind of race condition between the view of the file system at planning time and the state that will be found during execution, we could gracefully accept that some files may have disappeared between planning and execution.
> > > > > >
> > > > > > In the case the TTL would need to be changed, or removed completely, this could be done by re-issuing a REFRESH TABLE METADATA, either with a new TTL, or without a TTL at all.
> > > > > >
> > > > > > What do you think?
> > > > > >
> > > > > > Regards, Joel
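[Editor's note] The full refresh flow discussed in this thread (trust the root cache within its TTL; otherwise fall back to comparing directory modification times, rebuilding the cache on any change or just touching the root cache file so the next TTL window starts now) could look roughly like the sketch below. All names here are hypothetical stand-ins for Drill's internals: `plan_with_ttl`, `rebuild_metadata`, and the cache file name are illustrative assumptions, and directory mtimes are compared against the root cache file rather than against per-directory caches.

```python
import os
import time

def plan_with_ttl(root_dir, cache_name=".drill.parquet_metadata", ttl_seconds=900):
    """Sketch of the TTL-based refresh decision. Returns the action taken."""
    cache_path = os.path.join(root_dir, cache_name)
    cache_mtime = os.path.getmtime(cache_path)
    if time.time() < cache_mtime + ttl_seconds:
        return "trusted"  # fast path: a single stat() on the root cache file
    # TTL expired: fall back to today's behavior and compare every
    # directory's mtime against the root cache file's mtime.
    for dirpath, _dirs, _files in os.walk(root_dir):
        if os.path.getmtime(dirpath) > cache_mtime:
            rebuild_metadata(root_dir)  # hypothetical recursive refresh
            return "refreshed"
    # Nothing changed: touch the cache file so the next TTL window starts now.
    os.utime(cache_path, None)
    return "touched"

def rebuild_metadata(root_dir):
    # Placeholder for Drill's recursive metadata regeneration.
    pass
```

The "touched" branch matters for amortization: after an unchanged tree is scanned once, the next TTL window of queries again costs only one stat() call.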