Hi all,
I am writing this email to both the user group and the dev group, since it is
applicable to both.
I am now working on the Spark XML data source (
https://github.com/databricks/spark-xml).
It uses an InputFormat implementation, which I downgraded to the Hadoop 1.x
API for version compatibility.
However, I
Hi Kostas
With regard to your *second* point: I believe that requiring user apps to
explicitly declare their dependencies is the clearest API approach when it
comes to classpath and class loading.
However, what about the following API: *SparkContext.addJar(String
pathToJar)*. *Is this
I don't think there is a performance difference between the 1.x and 2.x APIs,
but it's not a big issue for your change; only
com.databricks.hadoop.mapreduce.lib.input.XmlInputFormat.java
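For comparison, a minimal sketch of how a custom input format is wired up through each Hadoop API from Spark (TextInputFormat stands in for the XML format here, and the (LongWritable, Text) record types are an assumption borrowed from Hadoop's text formats):

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.spark.SparkContext

// Hadoop 1.x ("old") API: org.apache.hadoop.mapred.InputFormat
def readOldApi(sc: SparkContext, path: String) =
  sc.hadoopFile(path,
    classOf[org.apache.hadoop.mapred.TextInputFormat],  // stand-in for a mapred-based XmlInputFormat
    classOf[LongWritable],
    classOf[Text])

// Hadoop 2.x ("new") API: org.apache.hadoop.mapreduce.InputFormat
def readNewApi(sc: SparkContext, path: String) =
  sc.newAPIHadoopFile(path,
    classOf[org.apache.hadoop.mapreduce.lib.input.TextInputFormat],  // stand-in for a mapreduce-based XmlInputFormat
    classOf[LongWritable],
    classOf[Text])

Either entry point returns an RDD of (key, value) pairs; which one applies depends only on which Hadoop package the InputFormat extends.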
Thank you for your reply!
I have already made the change locally, so changing it would be fine.
I just wanted to be sure which way is correct.
On 9 Dec 2015 18:20, "Fengdong Yu" wrote:
> I don't think there is a performance difference between the 1.x and 2.x APIs.
>
Hi,
I was wondering what the "official" view is on feature parity between the SQL
and DataFrame APIs. The docs are pretty sparse on the SQL front, and it seems
that some features are, at various times, supported in only one of the Spark SQL
dialect, the HiveQL dialect, and the DataFrame API: DF.cube(), DISTRIBUTE BY, CACHE LAZY
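To make the parity question concrete, here is a small sketch of the cube case (assuming a HiveContext in Spark 1.4+; the table and column names are made up), showing the same aggregation through the DataFrame API and through the HiveQL dialect:

import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.functions.sum

// DataFrame API: cube() is available on DataFrame since Spark 1.4.
def cubeViaDataFrame(df: DataFrame): DataFrame =
  df.cube("region", "product").agg(sum("sales"))

// HiveQL dialect: the same grouping is spelled WITH CUBE. Assumes the data
// has been registered as a temporary table named "sales".
def cubeViaSql(sqlContext: SQLContext): DataFrame =
  sqlContext.sql(
    "SELECT region, product, SUM(sales) AS total FROM sales GROUP BY region, product WITH CUBE")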
reminder! this is happening tomorrow morning.
On Wed, Dec 2, 2015 at 7:20 PM, shane knapp wrote:
> there's Yet Another Jenkins Security Advisory[tm], and a big release
> to patch it all coming out next wednesday.
>
> to that end i will be performing a jenkins update, as
I don't plan to abandon HiveQL compatibility, but I'd like to see us move
towards something with more SQL compliance (perhaps just newer versions of
the HiveQL parser). Exactly which parser will do that for us is under
investigation.
On Wed, Dec 9, 2015 at 11:02 AM, Xiao Li
Yeah, this is the same idea behind having Travis cache the ivy2 folder to
speed up builds. In Amplab Jenkins, each build workspace has its own Ivy
cache, which is preserved across build runs but used by only one active run
at a time, in order to avoid SBT Ivy lock
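For anyone who wants the same isolation in their own builds, a rough sketch of what a per-workspace cache can look like on the sbt side (sketch only; the path is a placeholder and the setting is worth double-checking against your sbt version):

// build.sbt: point Ivy at a cache inside this build's directory so concurrent
// builds on the same machine don't contend for the lock on a shared ~/.ivy2.
ivyPaths := new IvyPaths(baseDirectory.value, Some(baseDirectory.value / ".ivy2-cache"))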
Hi, Michael,
Does that mean SqlContext will be built on HiveQL in the near future?
Thanks,
Xiao Li
2015-12-09 10:36 GMT-08:00 Michael Armbrust :
> I think that it is generally good to have parity when the functionality is
> useful. However, in some cases various
hi,
I met the following exception when the driver program tried to recover from a
checkpoint. It looks like the logic relies on zeroTime being set, which doesn't
seem to happen here. Am I missing anything, or is it a bug in 1.4.1?
org.apache.spark.SparkException:
Hi,
Just use "objectFile" instead of "objectFile[PipelineModel]" for callJMethod.
You can take the objectFile() in context.R as an example.
Since the SparkContext created in SparkR is actually a JavaSparkContext, there
is no need to pass the implicit ClassTag.
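A sketch of the signature difference behind that advice (the erased signatures below are approximate; checking with javap is the reliable way to confirm them):

import org.apache.spark.SparkContext
import org.apache.spark.api.java.JavaSparkContext

// Scala API: the element type is carried by an implicit ClassTag, so in
// bytecode this erases to roughly objectFile(String, int, ClassTag), which is
// awkward to reach from SparkR's callJMethod.
def loadViaScalaApi(sc: SparkContext, path: String) =
  sc.objectFile[AnyRef](path)  // ClassTag filled in by the compiler

// Java-friendly wrapper: it supplies the ClassTag internally, so its bytecode
// signature is simply objectFile(String), which a callJMethod call naming
// "objectFile" can resolve directly.
def loadViaJavaApi(jsc: JavaSparkContext, path: String) =
  jsc.objectFile[AnyRef](path)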
-Original Message-
From:
Is this a candidate for the version 1.X/2.0 split?
2015-12-09 16:29 GMT-08:00 Michael Armbrust :
> Yeah, I would like to address any actual gaps in functionality that are
> present.
>
> On Wed, Dec 9, 2015 at 4:24 PM, Cristian Opris wrote:
Never mind, one of my peers corrected the driver program for me - all DStream
operations need to be defined within the scope of the getOrCreate API.
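For reference, a minimal sketch of that pattern (the checkpoint directory, batch interval, and source are placeholders): every DStream is created inside the factory function handed to getOrCreate, so the graph can be rebuilt from the checkpoint on recovery instead of being redefined outside it.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val checkpointDir = "/tmp/checkpoint"  // placeholder

// All DStream operations live inside this function. On a fresh start
// getOrCreate calls it; on recovery it rebuilds the graph from the checkpoint.
def createContext(): StreamingContext = {
  val conf = new SparkConf().setAppName("checkpointed-app")
  val ssc = new StreamingContext(conf, Seconds(10))
  ssc.checkpoint(checkpointDir)

  val lines = ssc.socketTextStream("localhost", 9999)  // placeholder source
  lines.count().print()

  ssc
}

val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
ssc.awaitTermination()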
On Wed, Dec 9, 2015 at 3:32 PM, Renyi Xiong wrote:
> following scala program throws same exception, I know people are running
> streaming
The SparkR callJMethod can only invoke methods as they show up in the
Java byte code. So in this case you'll need to check the SparkContext
byte code (with javap or something like that) to see how that method
looks. My guess is the type is passed in as a class tag argument, so
you'll need to do
Yeah, I would like to address any actual gaps in functionality that are
present.
On Wed, Dec 9, 2015 at 4:24 PM, Cristian Opris
wrote:
> The reason I'm asking is because it's important in larger projects to be
> able to stick to a particular programming style. Some
The following Scala program throws the same exception. I know people are
running streaming jobs against Kafka, so I must be missing something. Any idea why?
package org.apache.spark.streaming.api.csharp
import java.util.HashMap
import kafka.serializer.{DefaultDecoder, Decoder, StringDecoder}
import
That sounds great! When it is decided, please let us know and we can add
more features and make it ANSI SQL compliant.
Thank you!
Xiao Li
2015-12-09 11:31 GMT-08:00 Michael Armbrust :
> I don't plan to abandon HiveQL compatibility, but I'd like to see us move
> towards
here's the security advisory for the update:
https://wiki.jenkins-ci.org/display/SECURITY/Jenkins+Security+Advisory+2015-12-09
On Wed, Dec 9, 2015 at 9:55 AM, shane knapp wrote:
> reminder! this is happening tomorrow morning.
>
> On Wed, Dec 2, 2015 at 7:20 PM, shane knapp