Convert DevGuide and FileLocking docs to Markdown
Project: http://git-wip-us.apache.org/repos/asf/lucy/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucy/commit/c2363da1
Tree: http://git-wip-us.apache.org/repos/asf/lucy/tree/c2363da1
Diff: http://git-wip-us.apache.org/repos/asf/lucy/diff/c2363da1

Branch: refs/heads/master
Commit: c2363da1b9caa85b31bd599419429c9b7ca62e76
Parents: a1d2e1c
Author: Nick Wellnhofer <wellnho...@aevum.de>
Authored: Mon Jul 6 16:39:58 2015 +0200
Committer: Nick Wellnhofer <wellnho...@aevum.de>
Committed: Sat Jul 11 15:03:10 2015 +0200

----------------------------------------------------------------------
 core/Lucy/Docs/DevGuide.cfh              | 59 -------------------
 core/Lucy/Docs/DevGuide.md               | 37 ++++++++++++
 core/Lucy/Docs/FileLocking.cfh           | 83 ---------------------------
 core/Lucy/Docs/FileLocking.md            | 80 ++++++++++++++++++++++++++
 perl/buildlib/Lucy/Build/Binding/Docs.pm | 69 ----------------------
 perl/lib/Lucy/Docs/DevGuide.pm           | 24 --------
 perl/lib/Lucy/Docs/FileLocking.pm        | 24 --------
 7 files changed, 117 insertions(+), 259 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/lucy/blob/c2363da1/core/Lucy/Docs/DevGuide.cfh
----------------------------------------------------------------------
diff --git a/core/Lucy/Docs/DevGuide.cfh b/core/Lucy/Docs/DevGuide.cfh
deleted file mode 100644
index dfab5ff..0000000
--- a/core/Lucy/Docs/DevGuide.cfh
+++ /dev/null
@@ -1,59 +0,0 @@
-/* Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements. See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.
You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-parcel Lucy;
-
-/** Quick-start guide to hacking on Apache Lucy.
- *
- * The Apache Lucy code base is organized into roughly four layers:
- *
- * * Charmonizer - compiler and OS configuration probing.
- * * Clownfish - header files.
- * * C - implementation files.
- * * Host - binding language.
- *
- * Charmonizer is a configuration prober which writes a single header file,
- * "charmony.h", describing the build environment and facilitating
- * cross-platform development. It's similar to Autoconf or Metaconfig, but
- * written in pure C.
- *
- * The ".cfh" files within the Lucy core are Clownfish header files.
- * Clownfish is a purpose-built, declaration-only language which superimposes
- * a single-inheritance object model on top of C which is specifically
- * designed to co-exist happily with variety of "host" languages and to allow
- * limited run-time dynamic subclassing. For more information see the
- * Clownfish docs, but if there's one thing you should know about Clownfish OO
- * before you start hacking, it's that method calls are differentiated from
- * functions by capitalization:
- *
- *     Indexer_Add_Doc <-- Method, typically uses dynamic dispatch.
- *     Indexer_add_doc <-- Function, always a direct invocation.
- *
- * The C files within the Lucy core are where most of Lucy's low-level
- * functionality lies. They implement the interface defined by the Clownfish
- * header files.
- *
- * The C core is intentionally left incomplete, however; to be usable, it must
- * be bound to a "host" language.
(In this context, even C is considered a
- * "host" which must implement the missing pieces and be "bound" to the core.)
- * Some of the binding code is autogenerated by Clownfish on a spec customized
- * for each language. Other pieces are hand-coded in either C (using the
- * host's C API) or the host language itself.
- */
-
-inert class Lucy::Docs::DevGuide { }
-
-

http://git-wip-us.apache.org/repos/asf/lucy/blob/c2363da1/core/Lucy/Docs/DevGuide.md
----------------------------------------------------------------------
diff --git a/core/Lucy/Docs/DevGuide.md b/core/Lucy/Docs/DevGuide.md
new file mode 100644
index 0000000..a1b0a8b
--- /dev/null
+++ b/core/Lucy/Docs/DevGuide.md
@@ -0,0 +1,37 @@
+# Quick-start guide to hacking on Apache Lucy.
+
+The Apache Lucy code base is organized into roughly four layers:
+
+* Charmonizer - compiler and OS configuration probing.
+* Clownfish - header files.
+* C - implementation files.
+* Host - binding language.
+
+Charmonizer is a configuration prober which writes a single header file,
+"charmony.h", describing the build environment and facilitating
+cross-platform development. It's similar to Autoconf or Metaconfig, but
+written in pure C.
+
+The ".cfh" files within the Lucy core are Clownfish header files.
+Clownfish is a purpose-built, declaration-only language which superimposes
+a single-inheritance object model on top of C which is specifically
+designed to co-exist happily with variety of "host" languages and to allow
+limited run-time dynamic subclassing. For more information see the
+Clownfish docs, but if there's one thing you should know about Clownfish OO
+before you start hacking, it's that method calls are differentiated from
+functions by capitalization:
+
+    Indexer_Add_Doc <-- Method, typically uses dynamic dispatch.
+    Indexer_add_doc <-- Function, always a direct invocation.
+
+The C files within the Lucy core are where most of Lucy's low-level
+functionality lies.
They implement the interface defined by the Clownfish
+header files.
+
+The C core is intentionally left incomplete, however; to be usable, it must
+be bound to a "host" language. (In this context, even C is considered a
+"host" which must implement the missing pieces and be "bound" to the core.)
+Some of the binding code is autogenerated by Clownfish on a spec customized
+for each language. Other pieces are hand-coded in either C (using the
+host's C API) or the host language itself.
+

http://git-wip-us.apache.org/repos/asf/lucy/blob/c2363da1/core/Lucy/Docs/FileLocking.cfh
----------------------------------------------------------------------
diff --git a/core/Lucy/Docs/FileLocking.cfh b/core/Lucy/Docs/FileLocking.cfh
deleted file mode 100644
index 7e17bd4..0000000
--- a/core/Lucy/Docs/FileLocking.cfh
+++ /dev/null
@@ -1,83 +0,0 @@
-/* Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements. See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License. You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-parcel Lucy;
-
-/** Manage indexes on shared volumes.
- *
- * Normally, index locking is an invisible process.
Exclusive write access is
- * controlled via lockfiles within the index directory and problems only arise
- * if multiple processes attempt to acquire the write lock simultaneously;
- * search-time processes do not ordinarily require locking at all.
- *
- * On shared volumes, however, the default locking mechanism fails, and manual
- * intervention becomes necessary.
- *
- * Both read and write applications accessing an index on a shared volume need
- * to identify themselves with a unique `host` id, e.g. hostname or
- * ip address. Knowing the host id makes it possible to tell which lockfiles
- * belong to other machines and therefore must not be removed when the
- * lockfile's pid number appears not to correspond to an active process.
- *
- * At index-time, the danger is that multiple indexing processes from
- * different machines which fail to specify a unique `host` id can
- * delete each others' lockfiles and then attempt to modify the index at the
- * same time, causing index corruption. The search-time problem is more
- * complex.
- *
- * Once an index file is no longer listed in the most recent snapshot, Indexer
- * attempts to delete it as part of a post-[](cfish:Indexer.Commit) cleanup routine. It is
- * possible that at the moment an Indexer is deleting files which it believes
- * no longer needed, a Searcher referencing an earlier snapshot is in fact
- * using them. The more often that an index is either updated or searched,
- * the more likely it is that this conflict will arise from time to time.
- *
- * Ordinarily, the deletion attempts are not a problem. On a typical unix
- * volume, the files will be deleted in name only: any process which holds an
- * open filehandle against a given file will continue to have access, and the
- * file won't actually get vaporized until the last filehandle is cleared.
- * Thanks to "delete on last close semantics", an Indexer can't truly delete
- * the file out from underneath an active Searcher.
On Windows, where file
- * deletion fails whenever any process holds an open handle, the situation is
- * different but still workable: Indexer just keeps retrying after each commit
- * until deletion finally succeeds.
- *
- * On NFS, however, the system breaks, because NFS allows files to be deleted
- * out from underneath active processes. Should this happen, the unlucky read
- * process will crash with a "Stale NFS filehandle" exception.
- *
- * Under normal circumstances, it is neither necessary nor desirable for
- * IndexReaders to secure read locks against an index, but for NFS we have to
- * make an exception. LockFactory's [](cfish:LockFactory.Make_Shared_Lock) method exists for this
- * reason; supplying an IndexManager instance to IndexReader's constructor
- * activates an internal locking mechanism using [](cfish:LockFactory.Make_Shared_Lock) which
- * prevents concurrent indexing processes from deleting files that are needed
- * by active readers.
- *
- * Since shared locks are implemented using lockfiles located in the index
- * directory (as are exclusive locks), reader applications must have write
- * access for read locking to work. Stale lock files from crashed processes
- * are ordinarily cleared away the next time the same machine -- as identified
- * by the `host` parameter -- opens another IndexReader. (The
- * classic technique of timing out lock files is not feasible because search
- * processes may lie dormant indefinitely.) However, please be aware that if
- * the last thing a given machine does is crash, lock files belonging to it
- * may persist, preventing deletion of obsolete index data.
- */
-
-inert class Lucy::Docs::FileLocking { }
-
-

http://git-wip-us.apache.org/repos/asf/lucy/blob/c2363da1/core/Lucy/Docs/FileLocking.md
----------------------------------------------------------------------
diff --git a/core/Lucy/Docs/FileLocking.md b/core/Lucy/Docs/FileLocking.md
new file mode 100644
index 0000000..b28eb72
--- /dev/null
+++ b/core/Lucy/Docs/FileLocking.md
@@ -0,0 +1,80 @@
+# Manage indexes on shared volumes.
+
+Normally, index locking is an invisible process. Exclusive write access is
+controlled via lockfiles within the index directory and problems only arise
+if multiple processes attempt to acquire the write lock simultaneously;
+search-time processes do not ordinarily require locking at all.
+
+On shared volumes, however, the default locking mechanism fails, and manual
+intervention becomes necessary.
+
+Both read and write applications accessing an index on a shared volume need
+to identify themselves with a unique `host` id, e.g. hostname or
+ip address. Knowing the host id makes it possible to tell which lockfiles
+belong to other machines and therefore must not be removed when the
+lockfile's pid number appears not to correspond to an active process.
+
+At index-time, the danger is that multiple indexing processes from
+different machines which fail to specify a unique `host` id can
+delete each others' lockfiles and then attempt to modify the index at the
+same time, causing index corruption. The search-time problem is more
+complex.
+
+Once an index file is no longer listed in the most recent snapshot, Indexer
+attempts to delete it as part of a post-[](lucy:Indexer.Commit) cleanup routine. It is
+possible that at the moment an Indexer is deleting files which it believes
+no longer needed, a Searcher referencing an earlier snapshot is in fact
+using them. The more often that an index is either updated or searched,
+the more likely it is that this conflict will arise from time to time.
+
+Ordinarily, the deletion attempts are not a problem. On a typical unix
+volume, the files will be deleted in name only: any process which holds an
+open filehandle against a given file will continue to have access, and the
+file won't actually get vaporized until the last filehandle is cleared.
+Thanks to "delete on last close semantics", an Indexer can't truly delete
+the file out from underneath an active Searcher. On Windows, where file
+deletion fails whenever any process holds an open handle, the situation is
+different but still workable: Indexer just keeps retrying after each commit
+until deletion finally succeeds.
+
+On NFS, however, the system breaks, because NFS allows files to be deleted
+out from underneath active processes. Should this happen, the unlucky read
+process will crash with a "Stale NFS filehandle" exception.
+
+Under normal circumstances, it is neither necessary nor desirable for
+IndexReaders to secure read locks against an index, but for NFS we have to
+make an exception. LockFactory's [](lucy:LockFactory.Make_Shared_Lock) method exists for this
+reason; supplying an IndexManager instance to IndexReader's constructor
+activates an internal locking mechanism using [](lucy:LockFactory.Make_Shared_Lock) which
+prevents concurrent indexing processes from deleting files that are needed
+by active readers.
+
+~~~ perl
+use Sys::Hostname qw( hostname );
+my $hostname = hostname() or die "Can't get unique hostname";
+my $manager = Lucy::Index::IndexManager->new( host => $hostname );
+
+# Index time:
+my $indexer = Lucy::Index::Indexer->new(
+    index => '/path/to/index',
+    manager => $manager,
+);
+
+# Search time:
+my $reader = Lucy::Index::IndexReader->open(
+    index => '/path/to/index',
+    manager => $manager,
+);
+my $searcher = Lucy::Search::IndexSearcher->new( index => $reader );
+~~~
+
+Since shared locks are implemented using lockfiles located in the index
+directory (as are exclusive locks), reader applications must have write
+access for read locking to work. Stale lock files from crashed processes
+are ordinarily cleared away the next time the same machine -- as identified
+by the `host` parameter -- opens another IndexReader. (The
+classic technique of timing out lock files is not feasible because search
+processes may lie dormant indefinitely.) However, please be aware that if
+the last thing a given machine does is crash, lock files belonging to it
+may persist, preventing deletion of obsolete index data.
+

http://git-wip-us.apache.org/repos/asf/lucy/blob/c2363da1/perl/buildlib/Lucy/Build/Binding/Docs.pm
----------------------------------------------------------------------
diff --git a/perl/buildlib/Lucy/Build/Binding/Docs.pm b/perl/buildlib/Lucy/Build/Binding/Docs.pm
deleted file mode 100644
index 07e5ce3..0000000
--- a/perl/buildlib/Lucy/Build/Binding/Docs.pm
+++ /dev/null
@@ -1,69 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License.
You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-package Lucy::Build::Binding::Docs;
-use strict;
-use warnings;
-
-our $VERSION = '0.004000';
-$VERSION = eval $VERSION;
-
-sub bind_all {
-    my $class = shift;
-    $class->bind_devguide;
-    $class->bind_filelocking;
-}
-
-sub bind_devguide {
-    my $pod_spec = Clownfish::CFC::Binding::Perl::Pod->new;
-    my $binding = Clownfish::CFC::Binding::Perl::Class->new(
-        parcel => "Lucy",
-        class_name => "Lucy::Docs::DevGuide",
-    );
-    $binding->set_pod_spec($pod_spec);
-    Clownfish::CFC::Binding::Perl::Class->register($binding);
-}
-
-sub bind_filelocking {
-    my $pod_spec = Clownfish::CFC::Binding::Perl::Pod->new;
-    my $synopsis = <<'END_SYNOPSIS';
-    use Sys::Hostname qw( hostname );
-    my $hostname = hostname() or die "Can't get unique hostname";
-    my $manager = Lucy::Index::IndexManager->new( host => $hostname );
-
-    # Index time:
-    my $indexer = Lucy::Index::Indexer->new(
-        index => '/path/to/index',
-        manager => $manager,
-    );
-
-    # Search time:
-    my $reader = Lucy::Index::IndexReader->open(
-        index => '/path/to/index',
-        manager => $manager,
-    );
-    my $searcher = Lucy::Search::IndexSearcher->new( index => $reader );
-END_SYNOPSIS
-    $pod_spec->set_synopsis($synopsis);
-
-    my $binding = Clownfish::CFC::Binding::Perl::Class->new(
-        parcel => "Lucy",
-        class_name => "Lucy::Docs::FileLocking",
-    );
-    $binding->set_pod_spec($pod_spec);
-
-    Clownfish::CFC::Binding::Perl::Class->register($binding);
-}
-
-1;

http://git-wip-us.apache.org/repos/asf/lucy/blob/c2363da1/perl/lib/Lucy/Docs/DevGuide.pm
----------------------------------------------------------------------
diff --git
a/perl/lib/Lucy/Docs/DevGuide.pm b/perl/lib/Lucy/Docs/DevGuide.pm
deleted file mode 100644
index 4de9a03..0000000
--- a/perl/lib/Lucy/Docs/DevGuide.pm
+++ /dev/null
@@ -1,24 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-use Lucy;
-our $VERSION = '0.004000';
-$VERSION = eval $VERSION;
-
-1;
-
-__END__
-
-

http://git-wip-us.apache.org/repos/asf/lucy/blob/c2363da1/perl/lib/Lucy/Docs/FileLocking.pm
----------------------------------------------------------------------
diff --git a/perl/lib/Lucy/Docs/FileLocking.pm b/perl/lib/Lucy/Docs/FileLocking.pm
deleted file mode 100644
index 4de9a03..0000000
--- a/perl/lib/Lucy/Docs/FileLocking.pm
+++ /dev/null
@@ -1,24 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License.
You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-use Lucy;
-our $VERSION = '0.004000';
-$VERSION = eval $VERSION;
-
-1;
-
-__END__
-
-
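
Note on the "delete on last close" behavior that the FileLocking.md text in this commit relies on: the following is a minimal C sketch, assuming a POSIX system and a local (non-NFS) filesystem. It is not Lucy code; `read_after_unlink` and the path passed to it are hypothetical names chosen for illustration. It plays both roles from the doc: a reader holding an open descriptor (the "Searcher") and a deleter calling `unlink()` (the "Indexer").

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not part of Lucy.  Writes `msg` to a new file at
 * `path`, opens it for reading, unlinks it, and then reads through the
 * still-open descriptor into `buf`.  Returns the number of bytes read,
 * or -1 on any error (including the name still resolving after unlink). */
static long
read_after_unlink(const char *path, const char *msg, char *buf, size_t size) {
    size_t len = strlen(msg);

    /* Create the file and fill it with some content. */
    int wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (wfd < 0) { return -1; }
    if (write(wfd, msg, len) != (ssize_t)len) {
        close(wfd);
        unlink(path);
        return -1;
    }
    close(wfd);

    /* The "Searcher": holds an open descriptor on the file. */
    int rfd = open(path, O_RDONLY);
    if (rfd < 0) { unlink(path); return -1; }

    /* The "Indexer": removes the directory entry. */
    if (unlink(path) != 0) { close(rfd); return -1; }

    /* The name is gone... */
    struct stat st;
    int name_gone = (stat(path, &st) != 0);

    /* ...but the data stays readable until the last descriptor closes. */
    ssize_t got = read(rfd, buf, size);
    close(rfd);

    return name_gone ? (long)got : -1;
}
```

Calling `read_after_unlink("/tmp/lucy_unlink_demo.tmp", "segment data", buf, sizeof(buf) - 1)` on a local Unix volume should return the full length of the message even though the path no longer exists by the time of the read. The commit's text describes NFS as the case where this guarantee does not hold across machines, which is why it recommends supplying an IndexManager with a unique `host`.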