Can DataFrames with different schema be joined efficiently

2015-09-23 Thread MrJew
Hello, I'm using Spark Streaming to handle a fairly large data flow. I'm solving a problem where we infer the types from the data (we need more specific data types than what JSON provides), and quite often there are small differences between the schemas that we get. Saving to Parquet files

Standalone Cluster Local Authentication

2015-08-03 Thread MrJew
Hello, Similar to other cluster systems, e.g. ZooKeeper and Hazelcast, Spark has the problem that it is protected from the outside world, yet anyone with access to the host can run a Spark node without any authentication. Currently we are using Spark 1.3.1. Is there a way to enable