Hi,
I have multiple spark deployments using mesos.
I use spark.executor.uri to fetch the spark distribution to executor node.
Every time I upgrade spark, I download the default distribution and just
add a custom spark-env.sh to the spark/conf folder.
Furthermore, any change I want to do in
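For context, the fetch setting described above looks like this in spark-defaults.conf (a sketch; the URI below is a placeholder, not the actual path used here):

```
# spark-defaults.conf (sketch) - executors fetch the distribution from this URI.
# The path is a placeholder; any URI reachable by the mesos agents works
# (hdfs://, http://, or a local file: path).
spark.executor.uri  hdfs://namenode/dist/spark-x.y.z-bin-custom.tgz
```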
Hi,
I'm launching spark application on mesos cluster.
The metric namespace includes the framework id for driver metrics, and both
the framework id and the executor id for executor metrics.
These ids are assigned by mesos and are not permanent -
re-registering the application would
n Yang <yy201...@gmail.com> wrote:
>>
>>> The default value for spark.shuffle.reduceLocality.enabled is true.
>>>
>>> To reduce surprise to users of 1.5 and earlier releases, should the
>>> default value be set to false ?
>>>
>>> On Mon, Feb
> ...and see the distribution change... fully reproducible
>
Hi,
I've experienced a similar problem upgrading from spark 1.4 to spark 1.6.
The data is not evenly distributed across executors, but in my case it also
reproduces with legacy mode.
I also tried 1.6.1 rc-1, with the same results.
Still looking for a resolution.
Lior
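If the skew turns out to be related to the reduce-locality change discussed earlier in the thread, one experiment worth trying is disabling that flag (a sketch; this only reverts the scheduling behavior, it is not a confirmed fix for the uneven distribution):

```
# spark-defaults.conf (sketch) - revert to pre-1.6 reduce-task placement
spark.shuffle.reduceLocality.enabled  false
```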
Hi,
Using spark 1.4.0 in standalone mode, with the following configuration:
SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.appDataTtl=86400"
The cleanup interval is left at its default.
Application files are not deleted.
Using JavaSparkContext, and when the application ends it
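For reference, a spark-env.sh sketch of the settings above, with the sweep interval made explicit (one known caveat: the worker only deletes directories of applications that have actually stopped):

```shell
# spark-env.sh (sketch): worker-side cleanup for standalone mode.
# cleanup.interval (seconds) controls how often the worker sweeps;
# appDataTtl (seconds) is how long finished application dirs are kept.
export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true \
-Dspark.worker.cleanup.appDataTtl=86400 \
-Dspark.worker.cleanup.interval=1800"
```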
Does spark HiveContext support the rank() ... distribute by syntax (as in
the following article-
http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/doing_rank_with_hive
)?
If not, how can it be achieved?
Thanks,
Lior
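If the Hive-style rank()/DISTRIBUTE BY form turns out not to be supported, the same result can usually be had with a window function, which HiveContext gained in Spark 1.4 (a sketch; the table and columns grp/score are made up for illustration):

```sql
-- Rank rows within each grp, highest score first; equivalent in effect to
-- Hive's rank() UDF with DISTRIBUTE BY grp SORT BY score DESC.
SELECT grp,
       score,
       RANK() OVER (PARTITION BY grp ORDER BY score DESC) AS rnk
FROM scores;
```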
Hi,
Facing a bug with group by in SparkSQL (version 1.4).
I registered a JavaRDD of objects containing integer fields as a table.
Then I'm trying to do a group by, with a constant value in the group by
fields:
SELECT primary_one, primary_two, 10 as num, SUM(measure) as total_measures
FROM tbl
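If the literal in the grouped select is what trips the analyzer, one workaround sketch is to aggregate first and attach the constant in an outer query, so the literal never participates in the GROUP BY (column names are the ones from the query above):

```sql
SELECT primary_one, primary_two, 10 AS num, total_measures
FROM (
  SELECT primary_one, primary_two, SUM(measure) AS total_measures
  FROM tbl
  GROUP BY primary_one, primary_two
) t;
```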
Hi James,
There are a few configurations that you can try:
https://spark.apache.org/docs/latest/sql-programming-guide.html#other-configuration-options
From my experience, codegen really boosts things up. Just run
sqlContext.sql("SET spark.sql.codegen=true") before you execute your query. But
keep
I see that the pre-built distributions include hive-shims-0.23 shaded into the
spark-assembly jar (unlike when I make the distribution myself).
Does anyone know what I should do to include the shims in my distribution?
On Thu, May 14, 2015 at 9:52 AM, Lior Chaga lio...@taboola.com wrote
After profiling with YourKit, I see there's an OutOfMemoryException in
SQLContext.applySchema. Again, it's a very small RDD. Each executor
has 180GB RAM.
Ultimately it was a PermGen out of memory. I somehow missed it in the log.
Hi,
Using spark sql with HiveContext. Spark version is 1.3.1
When running local spark everything works fine. When running on spark
cluster I get ClassNotFoundError org.apache.hadoop.hive.shims.Hadoop23Shims.
This class belongs to hive-shims-0.23, and is a runtime dependency for
spark-hive:
Hi,
I'd like to use a JavaRDD containing parameters for an SQL query, and use
SparkSQL jdbc to load data from mySQL.
Consider the following pseudo code:
JavaRDD<String> namesRdd = ...;
...
options.put("url", "jdbc:mysql://mysql?user=usr");
options.put("password", "pass");
options.put("dbtable", "(SELECT *
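Spark's JDBC source can't take an RDD directly as a parameter source, so one common pattern is to collect the (presumably small) parameter RDD and splice the values into the dbtable subquery. A sketch of just the string-building part; the table name users and column name are hypothetical, and the quoting here is naive (don't use it with untrusted input):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DbTableBuilder {

    // Build a JDBC "dbtable" value that pushes an IN-list filter down to MySQL.
    static String buildDbTable(List<String> names) {
        String inList = names.stream()
                .map(n -> "'" + n.replace("'", "''") + "'") // naive escaping, sketch only
                .collect(Collectors.joining(", "));
        return "(SELECT * FROM users WHERE name IN (" + inList + ")) AS t";
    }

    public static void main(String[] args) {
        // e.g. names collected from the parameter RDD via namesRdd.collect()
        System.out.println(buildDbTable(Arrays.asList("alice", "bob")));
    }
}
```

The resulting string would go into the "dbtable" option, right where the pseudo code above breaks off.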
Using Spark 1.2.0. Tried to register an RDD and got:
scala.MatchError: class java.util.Date (of class java.lang.Class)
I see it was resolved in https://issues.apache.org/jira/browse/SPARK-2562
(included in 1.2.0)
Anyone encountered this issue?
Thanks,
Lior
        ;
    }
    public void setValue(Long value) {
        this.value = value;
    }
  }
}
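Assuming the MatchError comes from a java.util.Date field in the registered bean, one workaround sketch is switching that field to java.sql.Timestamp, which Spark's bean schema inference does map. The bean below is hypothetical, reconstructed only from the fragment above:

```java
import java.io.Serializable;
import java.sql.Timestamp;

// Hypothetical bean for illustration: java.sql.Timestamp (a supported bean
// field type for Spark SQL schema inference) in place of java.util.Date.
public class Event implements Serializable {
    private Timestamp time;
    private Long value;

    public Timestamp getTime() { return time; }
    public void setTime(Timestamp time) { this.time = time; }
    public Long getValue() { return value; }
    public void setValue(Long value) { this.value = value; }
}
```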
Hi,
Trying to run spark 1.2.1 with hadoop 1.0.4 on a cluster, and configure it to
run with log4j2.
The problem is that spark-assembly.jar contains log4j and slf4j classes
compatible with log4j 1.2, so slf4j detects it should use log4j 1.2 (