> On Aug. 13, 2014, 11:44 p.m., Lewis McGibbney wrote:
> > Hey Michael,
> > Can you please talk a bit about how streaming works for the FileMgr?
> > I am really interested about that.

The filemanager has two primary functions: cataloging metadata and holding a reference to a file. To extend this to streams, I needed to capture two things: stream metadata and a stream handle (the product name). I therefore created a new product structure that contains no references and does not transfer data. The metadata can be cataloged as usual, and the stream handle (name) is stored as the product name, which achieves both goals. The filemgr now has three structures: FLAT (a file), HIERARCHICAL (a directory of files), and STREAM (no files, just metadata).

The actual data of the stream is captured in Kafka, where the stream name is called a "topic". Data can reach Kafka through standard Kafka data-flows or through a new daemon that streams into Kafka. In either case, the metadata is stored in the filemanager with a single interaction. By separating the streaming of the actual data into a new daemon, the burden on the filemanager is reduced from continuous interactions (streaming in chunks of data) to a single interaction per stream.

This is the mechanism that allows streams to be cataloged, queried, and managed, while keeping the handling of the stream data separate to maintain filemanager efficiency.

Any questions/comments/suggestions?

- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22791/#review50528
-----------------------------------------------------------


On Aug. 13, 2014, 10:56 p.m., Michael Starch wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22791/
> -----------------------------------------------------------
>
> (Updated Aug. 13, 2014, 10:56 p.m.)
>
>
> Review request for oodt.
>
>
> Repository: oodt
>
>
> Description
> -------
>
> This patch contains all the changes needed to add in "streaming oodt" into
> the oodt svn repository.
>
> There are four main portions:
> -Mesos Framework for Resource Manager (Prototype working)
> -Spark Runner for Workflow Manager (Prototype working)
> -Filemanager "streaming" type (In development)
> -Deployment and cluster management scripts (In development)
>
> Where can this stuff be put so that it is available to use, even while it is
> in development?
>
>
> Diffs
> -----
>
> http://svn.apache.org/repos/asf/oodt/trunk/cluster-tools/scripts/shutdown.sh PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/cluster-tools/scripts/start-up.sh PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/cluster-tools/scripts/start-up/mesos-master.bash PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/cluster-tools/scripts/start-up/mesos-slave.bash PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/cluster-tools/scripts/start-up/resource.bash PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/cluster-tools/scripts/utilites.sh PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/cluster-tools/setup/env-vars.sh.tmpl PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/cluster-tools/setup/hosts PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/cluster-tools/setup/install.sh PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/cluster-tools/setup/required-software.txt PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/core/pom.xml 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/filemgr/src/main/java/org/apache/oodt/cas/filemgr/cli/action/IngestProductCliAction.java 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/filemgr/src/main/java/org/apache/oodt/cas/filemgr/datatransfer/LocalDataTransferer.java 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/filemgr/src/main/java/org/apache/oodt/cas/filemgr/metadata/extractors/CoreMetExtractor.java 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/filemgr/src/main/java/org/apache/oodt/cas/filemgr/metadata/extractors/examples/MimeTypeExtractor.java 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/filemgr/src/main/java/org/apache/oodt/cas/filemgr/structs/Product.java 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/filemgr/src/main/java/org/apache/oodt/cas/filemgr/structs/Reference.java 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/filemgr/src/main/java/org/apache/oodt/cas/filemgr/system/XmlRpcFileManager.java 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/filemgr/src/main/java/org/apache/oodt/cas/filemgr/versioning/BasicVersioner.java 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/filemgr/src/main/java/org/apache/oodt/cas/filemgr/versioning/DateTimeVersioner.java 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/filemgr/src/main/java/org/apache/oodt/cas/filemgr/versioning/SingleFileBasicVersioner.java 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/filemgr/src/main/java/org/apache/oodt/cas/filemgr/versioning/VersioningUtils.java 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/resource/pom.xml 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/java/org/apache/oodt/cas/resource/batchmgr/MesosBatchManager.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/java/org/apache/oodt/cas/resource/batchmgr/MesosBatchManagerFactory.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/java/org/apache/oodt/cas/resource/mesos/MesosUtilities.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/java/org/apache/oodt/cas/resource/mesos/OODTExecutor.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/java/org/apache/oodt/cas/resource/mesos/ResourceMesosFrameworkFactory.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/java/org/apache/oodt/cas/resource/mesos/ResourceMesosScheduler.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/java/org/apache/oodt/cas/resource/mesos/exception/MesosFrameworkException.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/java/org/apache/oodt/cas/resource/mesos/proto/ResourceProto.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/java/org/apache/oodt/cas/resource/monitor/MesosMonitor.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/java/org/apache/oodt/cas/resource/monitor/MesosMonitorFactory.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/java/org/apache/oodt/cas/resource/scheduler/Scheduler.java 1617800
> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/proto/resc.proto PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/pom.xml PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/assembly/assembly.xml PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/bin/streamer PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/java/org/apache/oodt/cas/streamer/publisher/KafkaPublisher.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/java/org/apache/oodt/cas/streamer/publisher/Publisher.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/java/org/apache/oodt/cas/streamer/reader/InputStreamReader.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/java/org/apache/oodt/cas/streamer/reader/Reader.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/java/org/apache/oodt/cas/streamer/reader/StreamEmptyException.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/java/org/apache/oodt/cas/streamer/streams/MultiFileSequentialInputStream.java.bak PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/java/org/apache/oodt/cas/streamer/streams/MultiFileSequentialInputStreamArcheaic.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/java/org/apache/oodt/cas/streamer/system/MultiSourceStreamer.java PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/resources/cmd-line-actions.xml PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/resources/cmd-line-options.xml PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/resources/logging.properties PRE-CREATION
> http://svn.apache.org/repos/asf/oodt/trunk/streamer/src/main/resources/streamer.properties PRE-CREATION
>
> Diff: https://reviews.apache.org/r/22791/diff/
>
>
> Testing
> -------
>
> Basic functionality tests done for both the resource manager and workflow
> manager pieces. The filemanager has been tested to properly ingest a
> "GenericStream" type with the lucene catalog only.
>
>
> Thanks,
>
> Michael Starch
>
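
For readers following the thread, the arrangement described in the reply above (a STREAM product that carries only metadata and a topic name, with the data itself going to a broker) could be sketched roughly as follows. This is only an illustration under stated assumptions: StreamCatalogSketch, Structure, Product, Catalog, and Broker are hypothetical names and are not the classes in the patch, Broker is an in-memory stand-in for Kafka, and the "sensor-feed" topic and "ProductType" metadata key are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are OODT's actual classes.
class StreamCatalogSketch {

    // The three product structures named in the reply.
    enum Structure { FLAT, HIERARCHICAL, STREAM }

    // A cataloged product: a name, a structure, metadata, and file references.
    static final class Product {
        final String name;
        final Structure structure;
        final Map<String, String> metadata;
        final List<String> references;

        Product(String name, Structure structure,
                Map<String, String> metadata, List<String> references) {
            // A STREAM product is metadata only: it must not point at files.
            if (structure == Structure.STREAM && !references.isEmpty()) {
                throw new IllegalArgumentException("a STREAM product holds no file references");
            }
            this.name = name;
            this.structure = structure;
            this.metadata = metadata;
            this.references = references;
        }

        // The product name doubles as the stream handle (the topic name).
        static Product stream(String topic, Map<String, String> metadata) {
            return new Product(topic, Structure.STREAM, metadata,
                    Collections.<String>emptyList());
        }
    }

    // Stand-in for the file manager catalog; counts how often it is called.
    static final class Catalog {
        private final Map<String, Product> products = new HashMap<>();
        private int interactions = 0;

        void ingest(Product product) {
            interactions++;
            products.put(product.name, product);
        }

        Product find(String name) { return products.get(name); }

        int interactions() { return interactions; }
    }

    // In-memory stand-in for the broker (Kafka in the reply): topic -> chunks.
    static final class Broker {
        private final Map<String, List<byte[]>> topics = new HashMap<>();

        void publish(String topic, byte[] chunk) {
            topics.computeIfAbsent(topic, t -> new ArrayList<>()).add(chunk);
        }

        int chunkCount(String topic) {
            List<byte[]> chunks = topics.get(topic);
            return chunks == null ? 0 : chunks.size();
        }
    }

    public static void main(String[] args) {
        Catalog catalog = new Catalog();
        Broker broker = new Broker();

        // One catalog interaction registers the stream and its metadata.
        Map<String, String> metadata = new HashMap<>();
        metadata.put("ProductType", "GenericStream");
        catalog.ingest(Product.stream("sensor-feed", metadata));

        // The data chunks bypass the catalog and go straight to the broker.
        for (int i = 0; i < 1000; i++) {
            broker.publish("sensor-feed", new byte[] { (byte) i });
        }

        System.out.println("catalog interactions: " + catalog.interactions());
        System.out.println("chunks in topic: " + broker.chunkCount("sensor-feed"));
    }
}
```

In this sketch the catalog is touched once no matter how many chunks are published, which is the efficiency argument made in the reply; a real deployment would replace Broker with an actual Kafka producer and Catalog with the file manager.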