Sqoop’s incremental data fetch will reduce the amount of data you need to pull from
the source, but by the time that incremental fetch is complete, is the data not
already out of date again, if the velocity of the data is high?
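For reference, the kind of incremental pull being discussed looks roughly like this — a sketch only, with the connection string, table, and column names (`orders`, `updated_at`, `id`) as placeholders:

```shell
# Hypothetical Sqoop incremental import from Postgres.
# Re-run periodically; rows with updated_at past --last-value are pulled
# and merged into the existing HDFS data on the key column.
sqoop import \
  --connect jdbc:postgresql://pg-host:5432/mydb \
  --username etl_user \
  --table orders \
  --incremental lastmodified \
  --check-column updated_at \
  --last-value "2016-01-01 00:00:00" \
  --merge-key id \
  --target-dir /data/orders
```

Between runs the HDFS copy lags the source by up to the scheduling interval, which is the staleness concern raised above.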
Maybe you can put a trigger in Postgres to send data to the big data cluster
as
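One hedged sketch of that trigger idea, using Postgres's built-in `pg_notify` to emit each changed row for an external listener process to forward to the cluster; the table and channel names (`orders`, `cdc_channel`) are placeholders:

```sql
-- Hypothetical sketch: push row changes out of Postgres on every write.
CREATE OR REPLACE FUNCTION notify_change() RETURNS trigger AS $$
BEGIN
  -- Serialize the new row as JSON and publish it on a NOTIFY channel;
  -- a separate listener would LISTEN on 'cdc_channel' and write to HBase.
  PERFORM pg_notify('cdc_channel', row_to_json(NEW)::text);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_cdc
AFTER INSERT OR UPDATE ON orders
FOR EACH ROW EXECUTE PROCEDURE notify_change();
```

Note that NOTIFY payloads are not durable — if the listener is down, changes are missed — so this is a low-latency complement to, not a replacement for, periodic batch reconciliation.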
I can't migrate this PostgreSQL data since lots of systems are using it, but I can
take this data to some NoSQL store like HBase and query HBase. The issue here is:
how can I make sure that HBase has up-to-date data?
Is velocity an issue in Postgres, such that your data would become stale as soon as
it
Ravi
Spark (or, for that matter, Big Data solutions like Hive) is suited for large
analytical loads, where “scaling up” starts to pale in comparison to
“scaling out” with regard to performance, versatility (types of data), and cost.
Without going into the details of MsSQL architecture, there