Author: rvs
Date: Fri Apr 20 16:25:35 2012
New Revision: 1328434
URL: http://svn.apache.org/viewvc?rev=1328434&view=rev
Log:
BIGTOP-545. package testing manifest in trunk needs to be updated
Modified:
incubator/bigtop/trunk/bigtop-tests/test-artifacts/package/src/main/resources/package_data.xml
Modified: incubator/bigtop/trunk/bigtop-tests/test-artifacts/package/src/main/resources/package_data.xml
URL: http://svn.apache.org/viewvc/incubator/bigtop/trunk/bigtop-tests/test-artifacts/package/src/main/resources/package_data.xml?rev=1328434&r1=1328433&r2=1328434&view=diff
==============================================================================
--- incubator/bigtop/trunk/bigtop-tests/test-artifacts/package/src/main/resources/package_data.xml (original)
+++ incubator/bigtop/trunk/bigtop-tests/test-artifacts/package/src/main/resources/package_data.xml Fri Apr 20 16:25:35 2012
@@ -124,7 +124,7 @@ Java Servlet and JavaServer Pages techno
<flume-node>
<metadata>
<summary>The flume node daemon is a core element of flume's data path and is responsible for generating, processing, and delivering data.</summary>
- <description>The Flume node daemon is a core element of flume's data path and is responsible for generating, processing, and delivering data.</description>
+ <description>Flume is a reliable, scalable, and manageable distributed data collection application for collecting data such as logs and delivering it to data stores such as Hadoop's HDFS. It can efficiently collect, aggregate, and move large amounts of log data. It has a simple, but flexible, architecture based on streaming data flows. It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms. The system is centrally managed and allows for intelligent dynamic management. It uses a simple extensible data model that allows for online analytic applications.</description>
<url>http://incubator.apache.org/projects/flume.html</url>
</metadata>
<deps>
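
Every package touched by this change follows the same entry shape in package_data.xml: a package element wrapping <metadata> (with <summary>, <description> and <url>) plus a <deps> list. A minimal sketch of such an entry, assembled only from the element names visible in this diff; the package name, metadata strings and the dependency value are illustrative placeholders, not copied from the file:

  <example-package>
    <metadata>
      <!-- summary/description are presumably the strings the package tests expect the installed package to report -->
      <summary>One-line summary of the package.</summary>
      <description>Longer description of the package.</description>
      <url>http://example.apache.org/</url>
    </metadata>
    <deps>
      <!-- mirrors the <hive>/self</hive> style entry further down; the exact semantics of /self are not shown in this diff -->
      <example-base>/self</example-base>
    </deps>
  </example-package>
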
@@ -141,8 +141,7 @@ Java Servlet and JavaServer Pages techno
<sqoop>
<metadata>
<summary>Sqoop allows easy imports and exports of data sets between databases and the Hadoop Distributed File System (HDFS).</summary>
- <description>Sqoop is a tool that provides the ability to import and export data sets between
- the Hadoop Distributed File System (HDFS) and relational databases.</description>
+ <description>Sqoop allows easy imports and exports of data sets between databases and the Hadoop Distributed File System (HDFS).</description>
<url>http://incubator.apache.org/sqoop/</url>
</metadata>
<deps>
@@ -183,37 +182,39 @@ Java Servlet and JavaServer Pages techno
<oozie>
<metadata>
<summary>Oozie is a system that runs workflows of Hadoop jobs.</summary>
- <description>Oozie workflows are actions arranged in a control dependency DAG (Direct
+ <description> Oozie is a system that runs workflows of Hadoop jobs.
+ Oozie workflows are actions arranged in a control dependency DAG (Direct
Acyclic Graph).
+
Oozie coordinator functionality allows to start workflows at regular
frequencies and when data becomes available in HDFS.
- .
+
An Oozie workflow may contain the following types of actions nodes:
map-reduce, map-reduce streaming, map-reduce pipes, pig, file-system,
sub-workflows, java, hive, sqoop and ssh (deprecated).
- .
+
Flow control operations within the workflow can be done using decision,
fork and join nodes. Cycles in workflows are not supported.
- .
+
Actions and decisions can be parameterized with job properties, actions
output (i.e. Hadoop counters) and HDFS file information (file exists,
file size, etc). Formal parameters are expressed in the workflow definition
- as variables.
- .
+ as ${VAR} variables.
+
A Workflow application is an HDFS directory that contains the workflow
definition (an XML file), all the necessary files to run all the actions:
JAR files for Map/Reduce jobs, shells for streaming Map/Reduce jobs, native
libraries, Pig scripts, and other resource files.
- .
- Running workflow jobs is done via command line tools, a WebServices API or
- a Java API.
- .
+
+ Running workflow jobs is done via command line tools, a WebServices API
+ or a Java API.
+
Monitoring the system and workflow jobs can be done via a web console, the
command line tools, the WebServices API and the Java API.
- .
+
Oozie is a transactional system and it has built in automatic and manual
retry capabilities.
- .
+
In case of workflow job failure, the workflow job can be rerun skipping
previously completed actions, the workflow application can be patched before
being rerun.</description>
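
The new text's points about workflow definitions being XML files and formal parameters being expressed as ${VAR} variables are easiest to see in a workflow definition itself. A minimal workflow.xml sketch (not part of this manifest; the names, property keys and schema version are illustrative of the general Oozie workflow format rather than taken from this commit):

  <workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.2">
    <start to="mr-node"/>
    <action name="mr-node">
      <map-reduce>
        <!-- ${VAR} formal parameters, resolved from job properties at submission time -->
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
          <property>
            <name>mapred.input.dir</name>
            <value>${inputDir}</value>
          </property>
          <property>
            <name>mapred.output.dir</name>
            <value>${outputDir}</value>
          </property>
        </configuration>
      </map-reduce>
      <!-- flow control: on success go to end, on error go to the kill node -->
      <ok to="end"/>
      <error to="fail"/>
    </action>
    <kill name="fail">
      <message>Map/Reduce action failed</message>
    </kill>
    <end name="end"/>
  </workflow-app>
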
@@ -394,7 +395,7 @@ server.</description>
<metadata>
<summary>Provides a Hive Thrift service.</summary>
<description>This optional package hosts a Thrift server for Hive clients across a network to use.</description>
- <url>http://hive.hadoop.apache.org/</url>
+ <url>http://hive.apache.org/</url>
</metadata>
<deps>
<hive>/self</hive>
@@ -415,9 +416,16 @@ server.</description>
<hbase>
<metadata>
<summary>HBase is the Hadoop database. Use it when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware.</summary>
- <description>Use it when you need random, realtime read/write access to your Big Data.
- This project's goal is the hosting of very large tables -- billions of rows
- X millions of columns -- atop clusters of commodity hardware.</description>
+ <description>HBase is an open-source, distributed, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. HBase includes:
+
+ * Convenient base classes for backing Hadoop MapReduce jobs with HBase tables
+ * Query predicate push down via server side scan and get filters
+ * Optimizations for real time queries
+ * A high performance Thrift gateway
+ * A REST-ful Web service gateway that supports XML, Protobuf, and binary data encoding options
+ * Cascading source and sink modules
+ * Extensible jruby-based (JIRB) shell
+ * Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX</description>
<url>http://hbase.apache.org/</url>
</metadata>
<deps>
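
The "Bigtable-like capabilities on top of Hadoop" wording comes down to HBase keeping its data in HDFS and splitting work across the master and regionserver daemons packaged below. A minimal hbase-site.xml sketch for a distributed deployment; the property names are standard HBase configuration keys, while the host names and path are placeholders (this file is not part of the manifest being changed here):

  <configuration>
    <!-- store HBase data in HDFS instead of the local filesystem -->
    <property>
      <name>hbase.rootdir</name>
      <value>hdfs://namenode-host:8020/hbase</value>
    </property>
    <!-- run fully distributed: separate HMaster and HRegionServer daemons -->
    <property>
      <name>hbase.cluster.distributed</name>
      <value>true</value>
    </property>
    <!-- ZooKeeper ensemble used for coordination -->
    <property>
      <name>hbase.zookeeper.quorum</name>
      <value>zk1,zk2,zk3</value>
    </property>
  </configuration>
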
@@ -442,14 +450,14 @@ server.</description>
<hbase-doc>
<metadata>
<summary>Hbase Documentation</summary>
- <description>This package contains the HBase manual and JavaDoc.</description>
+ <description>Documentation for Hbase</description>
<url>http://hbase.apache.org/</url>
</metadata>
</hbase-doc>
<hbase-master>
<metadata>
<summary>The Hadoop HBase master Server.</summary>
- <description>There is only one HMaster for a single HBase deployment.</description>
+ <description>HMaster is the "master server" for a HBase. There is only one HMaster for a single HBase deployment.</description>
<url>http://hbase.apache.org/</url>
</metadata>
<deps>
@@ -466,8 +474,7 @@ server.</description>
<hbase-regionserver>
<metadata>
<summary>The Hadoop HBase RegionServer server.</summary>
- <description>It checks in with the HMaster. There are many HRegionServers in a single
- HBase deployment.</description>
+ <description>HRegionServer makes a set of HRegions available to clients. It checks in with the HMaster. There are many HRegionServers in a single HBase deployment.</description>
<url>http://hbase.apache.org/</url>
</metadata>
<deps>
@@ -484,8 +491,8 @@ server.</description>
<hbase-thrift>
<metadata>
<summary>The Hadoop HBase Thrift Interface</summary>
- <description>This package provides a Thrift service interface to the HBase distributed
- database.</description>
+ <description>ThriftServer - this class starts up a Thrift server which implements the Hbase API specified in the Hbase.thrift IDL file.
+"Thrift is a software framework for scalable cross-language services development. It combines a powerful software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, and Ruby. Thrift was developed at Facebook, and we are now releasing it as open source." For additional information, see http://developers.facebook.com/thrift/. Facebook has announced their intent to migrate Thrift into Apache Incubator.</description>
<url>http://hbase.apache.org/</url>
</metadata>
<deps>