Hadoop and Open Data (CKAN.org).

2014-09-04 Thread Henrik Aagaard Jørgensen
Dear all, I'm very new to Hadoop and am still trying to grasp its value and purpose. I hope my question is OK for this mailing list. I manage the open data platform at our municipality, which runs on CKAN.org. It works very well for its purpose of showing data and adding APIs to data. However,

Re: Hadoop and Open Data (CKAN.org).

2014-09-04 Thread Alec Ten Harmsel
I would recommend using Hadoop only if you are ingesting a lot of data and you need reasonable performance at scale. I would recommend starting with <insert language/tool of choice> to ingest and transform data until that process starts taking too long. For example, one of our researchers at
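
As a rough illustration of the "start with a plain script" advice above, here is a minimal Python sketch, not anything taken from the thread. The column names `category` and `value`, the function `ingest_and_transform`, and the sample rows are all hypothetical. It ingests CSV rows, aggregates them, and times the run, so that the runtime can be watched as the data grows.

```python
import csv
import io
import time


def ingest_and_transform(lines):
    """Parse CSV rows and sum the 'value' column per 'category'.

    Both column names are assumptions made for this sketch.
    """
    totals = {}
    for row in csv.DictReader(lines):
        key = row["category"]
        totals[key] = totals.get(key, 0.0) + float(row["value"])
    return totals


# Hypothetical sample data standing in for an open data export.
sample = io.StringIO("category,value\nroads,1.5\nparks,2.0\nroads,3.0\n")

start = time.perf_counter()
totals = ingest_and_transform(sample)
elapsed = time.perf_counter() - start

print(totals)
print(f"elapsed: {elapsed:.4f}s")
```

If the elapsed time for real datasets stays in the range of seconds or minutes, a script like this is probably enough; a cluster only becomes interesting once that number stops being acceptable.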

Re: Hadoop and Open Data (CKAN.org).

2014-09-04 Thread Mohan Radhakrishnan
I understand that writing MR jobs in a programming language is required, but if we are just processing large amounts of data (machine learning, for example), we could use Pig. I recently processed 0.25 TB on AWS clusters in a reasonably short time. In this case the development effort is very small. Thanks,