All:
This may be off topic for Spark, but I'm sure several of you might have
used some form of this as part of your BigData implementations. So, wanted
to reach out.
As part of the Data Lake and Data Processing (with Spark, for example), we
might end up with different form factors for the files (via,
All:
If this question was already discussed, please let me know. I can try to
look into the archive.
Data Characteristics:
entity_id date fact_1 fact_2 fact_N derived_1 derived_2 derived_X
a) There are 1000s of such entities in the system
b) Each one has various Fact attributes per
All,
This is a theoretical question at this point in time. Wanted to pose this
question, before spending too much time to figure it out. Advance apologies
if this is not the right forum to ask this question.
Use-case:
- Migration from one cluster manager to another (e.g., Spark standalone
to
Hi,
I know this is a broad question. If this is not the right forum, I would
appreciate it if you could point me to other sites/areas that may be helpful.
Before posing this question, I did use our friend Google, but filtering
the search results down to what I need hasn't been easy.
Who I am:
- Have done