Yeah, this could make sense - allowing data sources to register a short
name. What mechanism did you have in mind? Using the JAR service loader?

The only issue is that there could be conflicts, since many of these are
third-party packages. If the same name were registered twice, I'm not sure
what the best behavior would be. Ideally, in my mind, if the same short name
were registered twice we'd force the user to use a fully qualified name and
report that the short name is ambiguous.
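As a rough sketch of what that resolution could look like: the names below (`DataSourceRegister`, `shortName`, `fullyQualifiedName`, `ShortNameResolver`) are hypothetical, not an existing Spark API. In a real build the providers would come from `ServiceLoader.load(...)` over entries registered in META-INF/services; here they are passed in directly so the example is self-contained.

```java
import java.util.*;

// Hypothetical interface a data source package would implement and
// register via META-INF/services for the JAR service-loader mechanism.
interface DataSourceRegister {
    String shortName();
    String fullyQualifiedName();
}

public class ShortNameResolver {
    private final Map<String, List<DataSourceRegister>> byShortName = new HashMap<>();

    // In practice: for (DataSourceRegister p : ServiceLoader.load(DataSourceRegister.class))
    ShortNameResolver(Iterable<DataSourceRegister> providers) {
        for (DataSourceRegister p : providers) {
            byShortName.computeIfAbsent(p.shortName(), k -> new ArrayList<>()).add(p);
        }
    }

    // Short names resolve only when exactly one provider claims them;
    // on a conflict the user is forced back to the fully qualified name.
    String resolve(String name) {
        List<DataSourceRegister> matches = byShortName.get(name);
        if (matches == null) {
            return name; // assume it is already a fully qualified class name
        }
        if (matches.size() > 1) {
            throw new IllegalArgumentException(
                "Short name '" + name + "' is ambiguous; use a fully qualified name");
        }
        return matches.get(0).fullyQualifiedName();
    }

    public static void main(String[] args) {
        DataSourceRegister avro = new DataSourceRegister() {
            public String shortName() { return "avro"; }
            public String fullyQualifiedName() { return "com.databricks.spark.avro"; }
        };
        ShortNameResolver r = new ShortNameResolver(List.of(avro));
        System.out.println(r.resolve("avro")); // com.databricks.spark.avro
    }
}
```

Unknown names just pass through unchanged, so existing fully-qualified usage keeps working.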

Patrick
On Jul 30, 2015 9:44 AM, "Joseph Batchik" <josephbatc...@gmail.com> wrote:

> Hi all,
>
> There are now starting to be a lot of data source packages for Spark. An
> annoyance I see is that I have to type in the full class name like:
>
> sqlContext.read.format("com.databricks.spark.avro").load(path).
>
> Spark internally has formats such as "parquet" and "jdbc" registered and
> it would be nice to be able just to type in "avro", "redshift", etc. as
> well. Would it be a good idea to use something like a service loader to
> allow data sources defined in other packages to register themselves with
> Spark? I think that this would make it easier for end users. I would be
> interested in adding this, please let me know what you guys think.
>
> - Joe
>
>
>
