Re: Announcement & Proposal: HDFS tests on large cluster.

Chamikara Jayalath Wed, 06 Jun 2018 09:57:24 -0700

On Wed, Jun 6, 2018 at 5:19 AM Łukasz Gajowy <lukasz.gaj...@gmail.com>
wrote:


> Hi all,
>
> I'd like to announce that, thanks to Kamil Szewczyk, since this PR
> <https://github.com/apache/beam/pull/5441> we have 4 file-based HDFS
> tests running on a "Large HDFS Cluster"! More specifically, I mean:
>
> - beam_PerformanceTests_TextIOIT_HDFS
> - beam_PerformanceTests_Compressed_TextIOIT_HDFS
> - beam_PerformanceTests_AvroIOIT_HDFS
> - beam_PerformanceTests_XmlIOIT_HDFS
>
> The "Large HDFS Cluster" (in contrast to the small one, which is also
> available) consists of a master node and three data nodes, all in separate
> pods. Thanks to that, we can mimic more realistic scenarios on HDFS (3
> distributed nodes) and possibly run bigger tests, so there's progress! :)
>
>
This is great. Also, it looks like the results are available in the test
dashboard:
https://apache-beam-testing.appspot.com/explore?dashboard=5755685136498688
(BTW, we should add information about the dashboard to the testing doc:
https://beam.apache.org/contribute/testing/)

> I'm currently working on proper documentation for this so that everyone can
> use it in IOITs (stay tuned).
>
> Regarding the above, I'd like to propose scaling up the
> Kubernetes cluster. AFAIK, it currently consists of 1 node. If we scale it
> up to e.g. 3 nodes, the HDFS Kubernetes pods will distribute themselves
> across different machines rather than one, making it an even more
> "real-life" scenario (possibly more efficient?). Moreover, other Performance
> Tests (such as JDBC or mongo) could use more space for their infrastructure
> as well. Scaling up the cluster could also prove useful for some future
> efforts, like BEAM-4508[1] (adapting and running some old IOITs on
> Jenkins).
>
> WDYT? Are there any objections?
>
+1 for increasing the size of the Kubernetes cluster.

>
> [1] https://issues.apache.org/jira/browse/BEAM-4508
>
>
