I see, thanks a lot for the clarifications.
I don’t have a good answer for that yet. My initial motivation here is
mainly to get consensus around this:
- DSv2 should support table names through SQL and the API, and
- It should use the existing classes in the logical plan (i.e.,
TableIdentifier)
To contrast, I think Wenchen is
I am definitely in favor of first-class / consistent support for tables and
data sources.
One thing that is not clear to me from this proposal is exactly what the
interfaces are between:
- Spark
- A (The?) metastore
- A data source
If we pass in the table identifier, is the data source then
>
> So here are my recommendations for moving forward, with DataSourceV2 as a
> starting point:
>
>1. Use well-defined logical plan nodes for all high-level operations:
>insert, create, CTAS, overwrite table, etc.
>2. Use rules that match on these high-level plan nodes, so that it
>
Hey,
I wanted to see that for a long time, too. :) If you'd plan on
implementing this, I could contribute.
However, I am not too familiar with variational inference for GPs,
which is what you would need, I guess.
Or do you think it is feasible to compute the full kernel for the GP?
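As a rough feasibility check: the full kernel is an n-by-n Gram matrix, so memory grows quadratically. A minimal sketch in plain Scala (the RBF kernel and all names below are illustrative, not tied to any particular GP library):

```scala
// Sketch only: materialize a dense RBF (squared-exponential) kernel matrix
// to gauge whether the full n x n Gram matrix fits in memory.
object FullKernelSketch {
  def rbfKernel(xs: Array[Double], lengthScale: Double): Array[Array[Double]] =
    xs.map { a =>
      xs.map { b =>
        math.exp(-math.pow(a - b, 2) / (2 * lengthScale * lengthScale))
      }
    }

  def main(args: Array[String]): Unit = {
    val n = 1000
    val xs = Array.tabulate(n)(_.toDouble)
    val k = rbfKernel(xs, lengthScale = 10.0)
    // Memory is O(n^2): 8 bytes per double times n * n entries.
    val approxBytes = 8L * n * n
    println(s"n=$n -> ${n.toLong * n} entries, ~${approxBytes / (1024 * 1024)} MB")
    assert(math.abs(k(0)(0) - 1.0) < 1e-12) // diagonal of an RBF kernel is 1
  }
}
```

For a few thousand points the full kernel is cheap; it is only beyond roughly 10^4-10^5 points that the quadratic memory cost forces approximations like variational inference.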
Cheers,
There are two main ways to load tables in Spark: by name (db.table) and by
path. Unfortunately, the DataSourceV2 integration has no support for
identifying tables by name.
I propose supporting the use of TableIdentifier, which is the standard way
to pass around table names.
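To make the two load paths concrete, here is a minimal sketch using the stable public API (a local SparkSession; the temp view name and parquet path are illustrative only):

```scala
// Sketch: loading a table by name (resolved through the catalog via a
// TableIdentifier) vs. by path (bypassing the catalog entirely).
import org.apache.spark.sql.SparkSession

object LoadByNameVsPath {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("dsv2-naming-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

    // By name: the string is parsed into a TableIdentifier and
    // resolved against the session catalog.
    df.createOrReplaceTempView("demo")
    val byName = spark.table("demo")

    // By path: the catalog is never consulted; the source reads files directly.
    val path = "/tmp/demo_parquet"
    df.write.mode("overwrite").parquet(path)
    val byPath = spark.read.parquet(path)

    assert(byName.count() == 2 && byPath.count() == 2)
    spark.stop()
  }
}
```

The proposal is about giving a V2 source access to the first, catalog-resolved form, which the current V2 read/write integration cannot express.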
The reason I
I am using Spark 2.1.0.
On Fri, Feb 2, 2018 at 5:08 PM, Pralabh Kumar wrote:
> Hi
>
> I am performing a broadcast join where my small table is 1 GB. I am
> getting the following error.
>
> I am using
>
>
> org.apache.spark.SparkException:
> . Available: 0, required:
Hi
I am performing a broadcast join where my small table is 1 GB. I am getting
the following error.
I am using
org.apache.spark.SparkException:
. Available: 0, required: 28869232. To avoid this, increase
spark.kryoserializer.buffer.max value
I increased the value to
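For reference, the error asks for roughly 28 MB, so any cap comfortably above that should do. A config sketch (the 512m value is illustrative; the same setting can be passed via spark-submit --conf):

```scala
// Sketch: raising the Kryo buffer cap above the ~28 MB the error reports.
// Note: Spark rejects values of 2048m or more for this setting.
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryoserializer.buffer.max", "512m")
```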
Hi Reynold,
That is in general a very good idea to get the community engaged (even if
most people would just listen / hide in the dark like myself). I know of no
other open source project, at ASF or elsewhere, where such an initiative has
even been tried. Kudos for the idea!
Best regards,
Jacek Laskowski