Hey community,
I would like to *educate* myself about why all *SQL implicits* (most
notably the conversion to the Dataset API) are imported from an *instance*
of SparkSession and not via static imports.
With this design one runs into problems like this:
For example, when you do Seq(1,2,3).toDF("a"), it needs to get the
SparkSession from somewhere. By importing the implicits from
spark.implicits._, they have access to a SparkSession for operations like
this.
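A minimal sketch in plain Scala (no Spark dependency; the names `Session` and `toLabeled` are invented for illustration) of why the implicits live on an instance rather than a static object: the conversion itself needs a reference to the enclosing session, which it can only capture by being defined inside that instance:

```scala
// Hypothetical stand-in for SparkSession: the implicit conversion defined
// inside it captures the enclosing instance, which static imports cannot do.
class Session(val name: String) {
  object implicits {
    // Enrichment available only after `import someSession.implicits._`
    implicit class SeqOps[A](s: Seq[A]) {
      // The conversion reaches the enclosing Session through the outer reference
      def toLabeled: Seq[(String, A)] = s.map(x => (name, x))
    }
  }
}

object Demo extends App {
  val session = new Session("spark")
  import session.implicits._
  // => List((spark,1), (spark,2), (spark,3))
  println(Seq(1, 2, 3).toLabeled)
}
```

This mirrors why `import spark.implicits._` must come from a `SparkSession` value: `toDF` needs that session to build a DataFrame.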
On Fri, Oct 14, 2016 at 4:42 PM, Jakub Dubovsky <
spark.dubovsky.ja...@gmail.com> wrote:
Basically, the implicit conversions that need it are rdd => dataset and
seq => dataset.
On Fri, Oct 14, 2016 at 5:47 PM, Koert Kuipers wrote:
> for example when you do Seq(1,2,3).toDF("a") it needs to get the
> SparkSession from somewhere. by importing the implicits from
>
About the Stack Overflow question, do this:
def validateAndTransform(df: DataFrame) : DataFrame = {
import df.sparkSession.implicits._
...
}
On Fri, Oct 14, 2016 at 5:51 PM, Koert Kuipers wrote:
> basically the implicit conversions that need it are rdd => dataset
Cody, the link is helpful, but I still have issues in my test.
I set "auto.offset.reset" to "earliest" and then create a KafkaRDD using an
OffsetRange which is out of range.
According to Kafka's documentation, I expect to get the earliest offset of
that partition.
But I get the exception below, and it looks
40GB
2016-10-14 14:20 GMT+08:00 Felix Cheung :
> How big is the metrics_moveing_detection_cube table?
> On Thu, Oct 13, 2016 at 8:51 PM -0700, "Lantao Jin"
> wrote:
>
> sqlContext <- sparkRHive.init(sc)
> sqlString<-
> "SELECT
> key_id,
Okay, thank you! Can you say when this feature will be released?
2016-10-13 16:29 GMT+02:00 Cody Koeninger :
> As Sean said, it's unreleased. If you want to try it out, build spark
>
> http://spark.apache.org/docs/latest/building-spark.html
>
> The easiest way to include
I can't be sure, no.
On Fri, Oct 14, 2016 at 3:06 AM, Julian Keppel
wrote:
> Okay, thank you! Can you say when this feature will be released?
>
> 2016-10-13 16:29 GMT+02:00 Cody Koeninger :
>>
>> As Sean said, it's unreleased. If you want to try
For you or anyone else having issues with consumer rebalance, what are
your settings for
heartbeat.interval.ms
session.timeout.ms
group.max.session.timeout.ms
relative to your batch time?
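As a hedged illustration of how those settings relate (the numbers below are invented for a hypothetical 30-second batch time, not recommendations): session.timeout.ms should comfortably exceed the batch time, heartbeat.interval.ms is conventionally about a third of session.timeout.ms, and session.timeout.ms must stay at or below the broker-side group.max.session.timeout.ms:

```
# Hypothetical consumer/broker settings for a 30s batch (illustration only)
heartbeat.interval.ms=30000
session.timeout.ms=90000
group.max.session.timeout.ms=120000
```

If session.timeout.ms is shorter than the time a batch can take, the consumer may be kicked out of the group mid-batch, triggering exactly the kind of rebalance described above.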
On Tue, Oct 11, 2016 at 10:19 AM, static-max wrote:
> Hi,
>
> I run into the
If you're creating a Kafka RDD as opposed to a DStream, you're
explicitly specifying a beginning and ending offset; auto.offset.reset
doesn't really have anything to do with it.
If you look at that log line, it's trying to read the 2nd message out
of the 0th partition of mytopic2, and is not able to
On 13 Oct 2016, at 10:50, dbolshak wrote:
Hello community,
We have a challenge and no idea how to solve it.
The problem,
Say we have the following environment:
1. `cluster A`, the cluster does not use kerberos and we use it as a