One of the ways of ingesting data into HDFS is to use a Spark JDBC connection to connect to the source and land the data in the underlying files or in Hive tables.
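For reference, a minimal sketch of such an ingest path (this is an illustrative example, not a definitive recipe: the JDBC URL, table names, credentials, and partition bounds below are hypothetical placeholders, and it assumes a Spark build with Hive support plus the relevant JDBC driver on the classpath):

```python
from pyspark.sql import SparkSession

# Hypothetical source connection -- substitute your own details.
jdbc_url = "jdbc:mysql://dbhost:3306/sales"

spark = (SparkSession.builder
         .appName("jdbc_ingest")
         .enableHiveSupport()
         .getOrCreate())

# Read the source table over JDBC. Supplying partitionColumn /
# lowerBound / upperBound / numPartitions spreads the read across
# executors instead of pulling everything through one connection.
df = (spark.read.format("jdbc")
      .option("url", jdbc_url)
      .option("dbtable", "transactions")
      .option("user", "etl_user")
      .option("password", "changeme")
      .option("partitionColumn", "id")
      .option("lowerBound", "1")
      .option("upperBound", "1000000")
      .option("numPartitions", "8")
      .load())

# Land the data as a Hive table (or write Parquet straight to HDFS
# with df.write.parquet("hdfs:///staging/transactions")).
df.write.mode("overwrite").saveAsTable("staging.transactions")
```

The partitioned read is what makes the I/O and CPU load show up across the whole cluster rather than on a single executor, which is exactly the behaviour one would want to measure under controlled conditions.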
One question that has come up is: under controlled test conditions, what would the measurements of I/O, CPU, etc. be across the cluster? Assuming we are not using UNIX tools such as Nagios, are there tools that can be deployed for the Spark cluster itself? I guess top/htop can be used, but those are available anyway.

Thanks,

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.