For the XML case, the best resources I have found so far are probably these two:
http://stevenskelton.ca/real-time-data-mining-spark/
http://blog.cloudera.com/blog/2014/03/why-apache-spark-is-a-crossover-hit-for-data-scientists/
And what about JSON? What if I have to work with JSON and want to use the
FasterXML (Jackson) implementation?
Any help with this would be appreciated.
On Apr 9, 2014 9:19 AM, "Flavio Pompermaier" wrote:
> Hi to everybody,
>
> In my current scenario I have complex objects stored as xml in an HBase
> Table.
> What's the best strategy to work with them? My final goal would be to
> define operators on those objects (like fil
Hi to everybody,
In my current scenario I have complex objects stored as xml in an HBase
Table.
What's the best strategy to work with them? My final goal would be to
define operators on those objects (like filter, equals, append, join,
merge, etc) and then work with multiple RDDs to perform some k