Hi,
I have multiple spark deployments using mesos.
I use spark.executor.uri to fetch the spark distribution to the executor nodes.
Every time I upgrade spark, I download the default distribution and just
add a custom spark-env.sh to the spark/conf folder.
Furthermore, any change I want to make in spa
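A minimal sketch of that setup (host, path, and version are hypothetical), in
conf/spark-defaults.conf:

    spark.executor.uri  hdfs://namenode:8020/dist/spark-1.6.0-bin-hadoop2.4.tgz

Mesos fetches and extracts this archive on each executor node, so the
customized conf/spark-env.sh inside the tarball travels with it.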
Hi,
I'm launching a spark application on a mesos cluster.
The namespace of the metric includes the framework id for driver metrics,
and both framework id and executor id for executor metrics.
These ids are obviously assigned by mesos, and they are not permanent -
re-registering the application would re
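A hedged sketch of one way to get stable metric names, assuming a much later
Spark release (spark.metrics.namespace was added in 2.1, well after this
thread): override the default id-based prefix with a fixed name:

    SparkConf conf = new SparkConf()
        .set("spark.metrics.namespace", "my-app"); // hypothetical stable name replacing the id-based prefix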
The default value for spark.shuffle.reduceLocality.enabled is true.
>>>
>>> To reduce surprise to users of 1.5 and earlier releases, should the
>>> default value be set to false?
>>>
>>> On Mon, Feb 29, 2016 at 5:38 AM, Lior Chaga wrote:
>>>
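For anyone who does want the earlier behavior back, a minimal sketch of opting
out per application (a hedged example; the flag name is taken from the thread
above):

    SparkConf conf = new SparkConf()
        .set("spark.shuffle.reduceLocality.enabled", "false"); // opt out of reduce-side locality preference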
bution change... fully reproducible
>
> On Sun, Feb 28, 2016 at 11:24 AM, Lior Chaga wrote:
>
>> Hi,
>> I've experienced a similar problem upgrading from spark 1.4 to spark 1.6.
>> The data is not evenly distributed across executors, but in my case it
>> also repr
Hi,
I've experienced a similar problem upgrading from spark 1.4 to spark 1.6.
The data is not evenly distributed across executors, but in my case it also
reproduced with legacy mode.
Also tried 1.6.1 rc-1, with same results.
Still looking for resolution.
Lior
On Fri, Feb 19, 2016 at 2:01 AM, Koe
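Since "legacy mode" comes up above, a hedged sketch of what that toggle
presumably refers to (the Spark 1.6 fallback to 1.5-style static memory
management):

    SparkConf conf = new SparkConf()
        .set("spark.memory.useLegacyMode", "true"); // revert to the pre-1.6 static memory manager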
Hi,
Using spark 1.4.0 in standalone mode, with the following configuration:
SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true
-Dspark.worker.cleanup.appDataTtl=86400"
The cleanup interval is set to the default.
Application files are not deleted.
Using JavaSparkContext, and when the application ends it
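Two things worth checking here (a hedged sketch; the interval shown is the
documented default, not from the original post): cleanup runs on a timer
controlled by spark.worker.cleanup.interval, and it only removes the work
directories of applications that have already stopped:

    SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true \
      -Dspark.worker.cleanup.interval=1800 \
      -Dspark.worker.cleanup.appDataTtl=86400"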
Does spark HiveContext support the rank() ... distribute by syntax (as in
the following article:
http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/doing_rank_with_hive
)?
If not, how can it be achieved?
Thanks,
Lior
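If the HiveQL parser rejects that exact distribute-by form, a hedged
alternative sketch (table and column names are hypothetical): HiveContext
supports SQL window functions as of Spark 1.4, which give the same per-group
ranking:

    hiveContext.sql(
        "SELECT category, item, price, " +
        "       rank() OVER (PARTITION BY category ORDER BY price DESC) AS rnk " +
        "FROM products");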
combinations.
On Wed, Jul 15, 2015 at 10:09 AM, Lior Chaga wrote:
> Hi,
>
> Facing a bug with group by in SparkSQL (version 1.4).
> Registered a JavaRDD of objects containing integer fields as a table.
>
> Then I'm trying to do a group by, with a constant value in the g
Hi,
Facing a bug with group by in SparkSQL (version 1.4).
Registered a JavaRDD of objects containing integer fields as a table.
Then I'm trying to do a group by, with a constant value in the group by
fields:
SELECT primary_one, primary_two, 10 as num, SUM(measure) as total_measures
FROM tbl
GRO
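One hedged workaround sketch, if the constant in the grouping clause is what
triggers the bug: since 10 is a literal, it does not need to appear in the
GROUP BY at all:

    SELECT primary_one, primary_two, 10 as num, SUM(measure) as total_measures
    FROM tbl
    GROUP BY primary_one, primary_two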
Hi James,
There are a few configurations that you can try:
https://spark.apache.org/docs/latest/sql-programming-guide.html#other-configuration-options
From my experience, the codegen really boosts things up. Just run
sqlContext.sql("SET spark.sql.codegen=true") before you execute your query. But
keep
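An equivalent way to set the same flag (SQLContext.setConf exists in the 1.x
API):

    sqlContext.setConf("spark.sql.codegen", "true");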
I see that the pre-built distributions include hive-shims-0.23 shaded into
the spark-assembly jar (unlike when I make the distribution myself).
Does anyone know what I should do to include the shims in my distribution?
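A hedged sketch of the era-appropriate build invocation (the profiles are
assumptions based on the versions in this thread); the -Phive profile is what
pulls the Hive shims into the assembly:

    ./make-distribution.sh --tgz -Phive -Phive-thriftserver -Phadoop-2.4 -Dhadoop.version=2.4.0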
On Thu, May 14, 2015 at 9:52 AM, Lior Chaga wrote:
> Ultimately it was Perm
Ultimately it was PermGen out of memory. I somehow missed it in the log
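A hedged sketch of the corresponding fix (sizes are assumptions; applies to
Java 7-era JVMs, where PermGen exists), in conf/spark-defaults.conf:

    spark.driver.extraJavaOptions    -XX:MaxPermSize=512m
    spark.executor.extraJavaOptions  -XX:MaxPermSize=512m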
On Thu, May 14, 2015 at 9:24 AM, Lior Chaga wrote:
> After profiling with YourKit, I see there's an OutOfMemoryException in
> SQLContext.applySchema. Again, it's a very small RDD. Each executor
After profiling with YourKit, I see there's an OutOfMemoryException in
SQLContext.applySchema. Again, it's a very small RDD. Each executor
has 180GB RAM.
On Thu, May 14, 2015 at 8:53 AM, Lior Chaga wrote:
> Hi,
>
> Using spark sql with HiveContext. Spark version is 1
Hi,
Using spark sql with HiveContext. Spark version is 1.3.1
When running Spark locally everything works fine. When running on the spark
cluster I get a ClassNotFoundError for org.apache.hadoop.hive.shims.Hadoop23Shims.
This class belongs to hive-shims-0.23, and is a runtime dependency for
spark-hive:
[INFO]
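A hedged sketch of one possible fix (the version is an assumption matching the
Hive 0.13 line that Spark of this era builds against): declare the shims
explicitly as a runtime dependency of the application:

    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-shims-0.23</artifactId>
      <version>0.13.1</version>
      <scope>runtime</scope>
    </dependency>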
Hi,
I'd like to use a JavaRDD containing parameters for an SQL query, and use the
Spark SQL JDBC data source to load data from MySQL.
Consider the following pseudo-code:
JavaRDD namesRdd = ... ;
...
options.put("url", "jdbc:mysql://mysql?user=usr");
options.put("password", "pass");
options.put("dbtable", "(SELEC
lue;
    }

    public void setValue(Long value) {
        this.value = value;
    }
}
}
On Sun, Apr 19, 2015 at 4:27 PM, Lior Chaga wrote:
> Using Spark 1.2.0. Tried to register an RDD and got:
> scala.MatchError: class java.util.Date (of class java.lang.
Using Spark 1.2.0. Tried to register an RDD and got:
scala.MatchError: class java.util.Date (of class java.lang.Class)
I see it was resolved in https://issues.apache.org/jira/browse/SPARK-2562
(included in 1.2.0)
Anyone encountered this issue?
Thanks,
Lior
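For what it's worth, a hedged workaround sketch (the bean is hypothetical):
Spark SQL's JavaBean schema inference handles java.sql.Timestamp and
java.sql.Date, so mapping java.util.Date to one of those sidesteps the
MatchError:

    public static class Event implements java.io.Serializable {
        private java.sql.Timestamp createdAt;
        public java.sql.Timestamp getCreatedAt() { return createdAt; }
        public void setCreatedAt(java.sql.Timestamp createdAt) { this.createdAt = createdAt; }
    }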
Hi,
Trying to run spark 1.2.1 w/ hadoop 1.0.4 on a cluster and configure it to
run with log4j2.
The problem is that spark-assembly.jar contains log4j and slf4j classes
compatible with log4j 1.2, so slf4j detects it should use log4j 1.2 (
https://github.com/apache/spark/blob/54e7b456dd56c9e52132154
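One hedged approach (paths and jar versions are hypothetical): since the
assembly's log4j 1.2 classes are found first, prepend log4j 2 together with
its log4j-1.2-api bridge to the driver and executor classpaths:

    spark.driver.extraClassPath    /opt/log4j2/log4j-api-2.1.jar:/opt/log4j2/log4j-core-2.1.jar:/opt/log4j2/log4j-1.2-api-2.1.jar
    spark.executor.extraClassPath  /opt/log4j2/log4j-api-2.1.jar:/opt/log4j2/log4j-core-2.1.jar:/opt/log4j2/log4j-1.2-api-2.1.jar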