Read from AWS S3 without having to hard-code sensitive keys

2016-01-11 Thread Krishna Rao
Hi all,

Is there a method for reading from S3 without having to hard-code keys? The only two ways I've found both require this:

1. Set conf in code, e.g.:
   sc.hadoopConfiguration().set("fs.s3.awsAccessKeyId", "")
   sc.hadoopConfiguration().set("fs.s3.awsSecretAccessKey", "")
2. Set keys in URL, e.g.:
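One way to keep the keys out of source code is to read them from the process environment at runtime and only then pass them to the Hadoop configuration. The following is a minimal Python sketch of that idea, not something taken from the thread. The helper names `s3_credentials_from_env` and `apply_to` are hypothetical, and the environment variable names `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are an assumption based on the usual AWS convention. The property names are the ones quoted in the question.

```python
import os


def s3_credentials_from_env(environ=None):
    """Build the Hadoop S3 credential properties from environment variables.

    Assumption: the keys are exported as AWS_ACCESS_KEY_ID and
    AWS_SECRET_ACCESS_KEY before the job starts.
    """
    env = os.environ if environ is None else environ
    try:
        access_key = env["AWS_ACCESS_KEY_ID"]
        secret_key = env["AWS_SECRET_ACCESS_KEY"]
    except KeyError as missing:
        # Fail early rather than letting the job start without credentials.
        raise RuntimeError(f"environment variable {missing} is not set") from None
    return {
        "fs.s3.awsAccessKeyId": access_key,
        "fs.s3.awsSecretAccessKey": secret_key,
    }


def apply_to(hadoop_conf, props):
    """Copy each property onto any object exposing set(key, value)."""
    for key, value in props.items():
        hadoop_conf.set(key, value)
```

With PySpark, the configuration object is commonly reached through `sc._jsc.hadoopConfiguration()` (a private attribute, so treat that as an assumption), giving `apply_to(sc._jsc.hadoopConfiguration(), s3_credentials_from_env())`. The keys then live only in the environment of the driver process and never appear in the code or in the S3 URL.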

Run ad-hoc queries at runtime against cached RDDs

2015-12-14 Thread Krishna Rao
Hi all, What's the best way to run ad-hoc queries against cached RDDs? For example, say I have an RDD that has been processed and persisted to memory only. I want to be able to run a count (actually "countApproxDistinct") after filtering by an, at compile time, unknown (specified by query)

Re: Run ad-hoc queries at runtime against cached RDDs

2015-12-14 Thread Krishna Rao
> ttle bit more on the use case? It looks a little bit
> like an abuse of Spark in general. Interactive queries that are not
> suitable for in-memory batch processing might be better supported by Ignite
> that has in-memory indexes, concept of hot, warm, cold data etc. or Hive on
> Tez+LLAP