Assume I want to make a PairRDD whose keys are S3 URLs and whose values are 
Strings holding the contents of those (UTF-8) files, but NOT split into lines. 
Are there length limits on those files/Strings? 1 MB? 16 MB? 4 GB? 1 TB?
Similarly, can such an RDD be registered as a table so that I can use substr() 
to pick out pieces of each string?
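
To make the question concrete, here is a minimal sketch of what I have in mind. It is written against PySpark and assumes an existing SparkSession passed in as `spark`. The helper names, the column names `url` and `content`, the table name `docs`, and the bucket path in the usage comment are all made up for illustration. It only shows the shape of the thing. Whether it holds up for very large files is exactly what I am asking.

```python
def whole_files_as_table(spark, path, table_name="docs"):
    """Load each file under `path` as one (url, content) row and register it.

    Assumes `spark` is an existing SparkSession. wholeTextFiles() returns one
    (file path, file contents) pair per file, with the contents kept as a
    single string rather than split into lines.
    """
    pairs = spark.sparkContext.wholeTextFiles(path)
    # Column names "url" and "content" are illustrative choices.
    df = pairs.toDF(["url", "content"])
    df.createOrReplaceTempView(table_name)
    return df


def substr_query(table_name, start, length):
    """Build a SQL query that pulls a slice out of each file's contents.

    SQL substr() positions are 1-based, so start=1 means the first character.
    """
    return (
        f"SELECT url, substr(content, {int(start)}, {int(length)}) AS piece "
        f"FROM {table_name}"
    )


# Intended usage (hypothetical bucket and prefix, not executed here):
#   whole_files_as_table(spark, "s3a://my-bucket/docs/*")
#   spark.sql(substr_query("docs", 1, 100)).show()
```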

Thanks,
Ron
