Re: If you use Spark 1.5 and disabled Tungsten mode ...

2015-11-05 Thread Sjoerd Mulder
Hi Reynold,

I had Janino version 2.6.1 in my project, which was provided by the fine
folks from spring-boot-dependencies.

Now I have overridden it to 2.7.8 :)
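
(For anyone else hitting this clash: a minimal sketch of such an override,
assuming a Maven build that imports spring-boot-dependencies as a BOM.
Entries declared directly in the project's own dependencyManagement take
precedence over versions coming from the imported BOM.)

<!-- Sketch only: pin Janino 2.7.8, the version Spark 1.5 expects,
     ahead of the 2.6.1 that the imported BOM manages. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.codehaus.janino</groupId>
      <artifactId>janino</artifactId>
      <version>2.7.8</version>
    </dependency>
  </dependencies>
</dependencyManagement>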

Sjoerd

2015-11-01 8:22 GMT+01:00 Reynold Xin :

> Thanks for reporting it, Sjoerd. You might have a different version of
> Janino brought in from somewhere else.
>
> This should fix your problem: https://github.com/apache/spark/pull/9372
>
> Can you give it a try?
>
>
>
> On Tue, Oct 27, 2015 at 9:12 PM, Sjoerd Mulder 
> wrote:
>
>> No, the job doesn't actually fail, but since our tests are generating all
>> these stacktraces I have disabled Tungsten mode just to be sure (and
>> don't have a gazillion stacktraces in production).
>>
>> 2015-10-27 20:59 GMT+01:00 Josh Rosen :
>>
>>> Hi Sjoerd,
>>>
>>> Did your job actually *fail* or did it just generate many spurious
>>> exceptions? While the stacktrace that you posted does indicate a bug, I
>>> don't think that it should have stopped query execution because Spark
>>> should have fallen back to an interpreted code path (note the "Failed
>>> to generate ordering, fallback to interpreted" in the error message).
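
(A rough Scala sketch of the fallback being described here; simplified, not
Spark's literal source. GenerateOrdering and RowOrdering are assumed to be
the Spark 1.5 codegen and interpreted implementations:)

import org.apache.spark.Logging
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.{Attribute, RowOrdering, SortOrder}
import org.apache.spark.sql.catalyst.expressions.codegen.GenerateOrdering

object OrderingFallback extends Logging {
  // Try the compiled ordering first; if Janino fails to compile the
  // generated Java, log the error and fall back to the interpreted path
  // instead of failing the query.
  def newOrdering(order: Seq[SortOrder], schema: Seq[Attribute]): Ordering[InternalRow] =
    try {
      GenerateOrdering.generate(order, schema) // codegen fast path
    } catch {
      case e: Exception =>
        logError("Failed to generate ordering, fallback to interpreted", e)
        new RowOrdering(order, schema)         // interpreted fallback
    }
}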
>>>
>>> On Tue, Oct 27, 2015 at 12:56 PM Sjoerd Mulder 
>>> wrote:
>>>
 I have disabled it because it started generating ERRORs when
 upgrading from Spark 1.4 to 1.5.1:

 2015-10-27T20:50:11.574+0100 ERROR TungstenSort.newOrdering() - Failed
 to generate ordering, fallback to interpreted
 java.util.concurrent.ExecutionException: java.lang.Exception: failed to
 compile: org.codehaus.commons.compiler.CompileException: Line 15, Column 9:
 Invalid character input "@" (character code 64)

 public SpecificOrdering
 generate(org.apache.spark.sql.catalyst.expressions.Expression[] expr) {
   return new SpecificOrdering(expr);
 }

 class SpecificOrdering extends
 org.apache.spark.sql.catalyst.expressions.codegen.BaseOrdering {

   private org.apache.spark.sql.catalyst.expressions.Expression[]
 expressions;



   public
 SpecificOrdering(org.apache.spark.sql.catalyst.expressions.Expression[]
 expr) {
 expressions = expr;

   }

   @Override
   public int compare(InternalRow a, InternalRow b) {
 InternalRow i = null;  // Holds current row being evaluated.

 i = a;
 boolean isNullA2;
 long primitiveA3;
 {
   /* input[2, LongType] */

   boolean isNull0 = i.isNullAt(2);
   long primitive1 = isNull0 ? -1L : (i.getLong(2));

   isNullA2 = isNull0;
   primitiveA3 = primitive1;
 }
 i = b;
 boolean isNullB4;
 long primitiveB5;
 {
   /* input[2, LongType] */

   boolean isNull0 = i.isNullAt(2);
   long primitive1 = isNull0 ? -1L : (i.getLong(2));

   isNullB4 = isNull0;
   primitiveB5 = primitive1;
 }
 if (isNullA2 && isNullB4) {
   // Nothing
 } else if (isNullA2) {
   return 1;
 } else if (isNullB4) {
   return -1;
 } else {
   int comp = (primitiveA3 > primitiveB5 ? 1 : primitiveA3 <
 primitiveB5 ? -1 : 0);
   if (comp != 0) {
 return -comp;
   }
 }

 return 0;
   }
 }

 at
 org.spark-project.guava.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
 at
 org.spark-project.guava.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293)
 at
 org.spark-project.guava.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
 at
 org.spark-project.guava.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135)
 at
 org.spark-project.guava.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2410)
 at
 org.spark-project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2380)
 at
 org.spark-project.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
 at
 org.spark-project.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
 at org.spark-project.guava.cache.LocalCache.get(LocalCache.java:4000)
 at
 org.spark-project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
 at
 org.spark-project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
 at
 org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.compile(CodeGenerator.scala:362)
 at
 org.apache.spark.sql.catalyst.expressions.codegen.GenerateOrdering$.create(GenerateOrdering.scala:139)
 at
 org.apache.spark.sql.catalyst.expressions.codegen.GenerateOrdering$.create(GenerateOrdering.scala:37)
 at
 org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:425)
 at
 

Re: If you use Spark 1.5 and disabled Tungsten mode ...

2015-11-01 Thread Reynold Xin
Thanks for reporting it, Sjoerd. You might have a different version of
Janino brought in from somewhere else.

This should fix your problem: https://github.com/apache/spark/pull/9372

Can you give it a try?



On Tue, Oct 27, 2015 at 9:12 PM, Sjoerd Mulder 
wrote:

> No, the job doesn't actually fail, but since our tests are generating all
> these stacktraces I have disabled Tungsten mode just to be sure (and
> don't have a gazillion stacktraces in production).
>
> 2015-10-27 20:59 GMT+01:00 Josh Rosen :
>
>> Hi Sjoerd,
>>
>> Did your job actually *fail* or did it just generate many spurious
>> exceptions? While the stacktrace that you posted does indicate a bug, I
>> don't think that it should have stopped query execution because Spark
>> should have fallen back to an interpreted code path (note the "Failed to
>> generate ordering, fallback to interpreted" in the error message).
>>
>> On Tue, Oct 27, 2015 at 12:56 PM Sjoerd Mulder 
>> wrote:
>>
>>> I have disabled it because it started generating ERRORs when
>>> upgrading from Spark 1.4 to 1.5.1:
>>>
>>> 2015-10-27T20:50:11.574+0100 ERROR TungstenSort.newOrdering() - Failed
>>> to generate ordering, fallback to interpreted
>>> java.util.concurrent.ExecutionException: java.lang.Exception: failed to
>>> compile: org.codehaus.commons.compiler.CompileException: Line 15, Column 9:
>>> Invalid character input "@" (character code 64)
>>>
>>> public SpecificOrdering
>>> generate(org.apache.spark.sql.catalyst.expressions.Expression[] expr) {
>>>   return new SpecificOrdering(expr);
>>> }
>>>
>>> class SpecificOrdering extends
>>> org.apache.spark.sql.catalyst.expressions.codegen.BaseOrdering {
>>>
>>>   private org.apache.spark.sql.catalyst.expressions.Expression[]
>>> expressions;
>>>
>>>
>>>
>>>   public
>>> SpecificOrdering(org.apache.spark.sql.catalyst.expressions.Expression[]
>>> expr) {
>>> expressions = expr;
>>>
>>>   }
>>>
>>>   @Override
>>>   public int compare(InternalRow a, InternalRow b) {
>>> InternalRow i = null;  // Holds current row being evaluated.
>>>
>>> i = a;
>>> boolean isNullA2;
>>> long primitiveA3;
>>> {
>>>   /* input[2, LongType] */
>>>
>>>   boolean isNull0 = i.isNullAt(2);
>>>   long primitive1 = isNull0 ? -1L : (i.getLong(2));
>>>
>>>   isNullA2 = isNull0;
>>>   primitiveA3 = primitive1;
>>> }
>>> i = b;
>>> boolean isNullB4;
>>> long primitiveB5;
>>> {
>>>   /* input[2, LongType] */
>>>
>>>   boolean isNull0 = i.isNullAt(2);
>>>   long primitive1 = isNull0 ? -1L : (i.getLong(2));
>>>
>>>   isNullB4 = isNull0;
>>>   primitiveB5 = primitive1;
>>> }
>>> if (isNullA2 && isNullB4) {
>>>   // Nothing
>>> } else if (isNullA2) {
>>>   return 1;
>>> } else if (isNullB4) {
>>>   return -1;
>>> } else {
>>>   int comp = (primitiveA3 > primitiveB5 ? 1 : primitiveA3 <
>>> primitiveB5 ? -1 : 0);
>>>   if (comp != 0) {
>>> return -comp;
>>>   }
>>> }
>>>
>>> return 0;
>>>   }
>>> }
>>>
>>> at
>>> org.spark-project.guava.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
>>> at
>>> org.spark-project.guava.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293)
>>> at
>>> org.spark-project.guava.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
>>> at
>>> org.spark-project.guava.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135)
>>> at
>>> org.spark-project.guava.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2410)
>>> at
>>> org.spark-project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2380)
>>> at
>>> org.spark-project.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
>>> at
>>> org.spark-project.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
>>> at org.spark-project.guava.cache.LocalCache.get(LocalCache.java:4000)
>>> at
>>> org.spark-project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
>>> at
>>> org.spark-project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
>>> at
>>> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.compile(CodeGenerator.scala:362)
>>> at
>>> org.apache.spark.sql.catalyst.expressions.codegen.GenerateOrdering$.create(GenerateOrdering.scala:139)
>>> at
>>> org.apache.spark.sql.catalyst.expressions.codegen.GenerateOrdering$.create(GenerateOrdering.scala:37)
>>> at
>>> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:425)
>>> at
>>> org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:422)
>>> at
>>> org.apache.spark.sql.execution.SparkPlan.newOrdering(SparkPlan.scala:294)
>>> at org.apache.spark.sql.execution.TungstenSort.org
>>> $apache$spark$sql$execution$TungstenSort$$preparePartition$1(sort.scala:131)
>>> at
>>> 

Re: If you use Spark 1.5 and disabled Tungsten mode ...

2015-10-21 Thread Jerry Lam
Hi guys,

There is another memory issue. Not sure if this is related to Tungsten this
time because I have it disabled (spark.sql.tungsten.enabled=false). It
happens more often when there are too many tasks running (300). I need to
limit the number of tasks to avoid this. The executor has 6G. Spark 1.5.1 is
being used.

Best Regards,

Jerry

org.apache.spark.SparkException: Task failed while writing rows.
at 
org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:393)
at 
org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
at 
org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Unable to acquire 67108864 bytes of memory
at 
org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.acquireNewPage(UnsafeExternalSorter.java:351)
at 
org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.<init>(UnsafeExternalSorter.java:138)
at 
org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.create(UnsafeExternalSorter.java:106)
at 
org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:74)
at 
org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:56)
at 
org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:339)
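
(A back-of-the-envelope Scala sketch of why task concurrency matters here,
assuming Spark 1.5's legacy memory model with its defaults
spark.shuffle.memoryFraction = 0.2 and spark.shuffle.safetyFraction = 0.8,
and assuming each running task can use at most roughly a 1/N share of that
pool. The failing request above, 67108864 bytes, is a single 64 MB sorter
page; the 32-task figure is the fine-grained scenario described later in the
thread:)

// All sizes in bytes; the inputs below are assumptions, not measurements.
val executorHeap  = 6L * 1024 * 1024 * 1024           // the 6G executor
val executionPool = (executorHeap * 0.2 * 0.8).toLong // ~0.96 GB for shuffle/sort
val runningTasks  = 32                                // concurrent tasks on one executor
val perTaskCap    = executionPool / runningTasks      // ~30 MB ceiling per task
val requestedPage = 64L * 1024 * 1024                 // 67108864 bytes
println(perTaskCap < requestedPage)                   // true: the page cannot be granted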


On Tue, Oct 20, 2015 at 9:10 PM, Reynold Xin <r...@databricks.com> wrote:

> With Jerry's permission, sending this back to the dev list to close the
> loop.
>
>
> -- Forwarded message --
> From: Jerry Lam <chiling...@gmail.com>
> Date: Tue, Oct 20, 2015 at 3:54 PM
> Subject: Re: If you use Spark 1.5 and disabled Tungsten mode ...
> To: Reynold Xin <r...@databricks.com>
>
>
> Yup, coarse-grained mode works just fine. :)
> The difference is that by default, coarse-grained mode uses 1 core per
> task. If I constrain 20 cores in total, there can be only 20 tasks running
> at the same time. However, with fine-grained, I cannot set the total number
> of cores, and therefore there could be 200+ tasks running at the same time
> (it is dynamic). So it might be that the calculation of how much memory to
> acquire fails when the number of cores cannot be known ahead of time,
> because you cannot assume that X tasks are running in an executor? Just my
> guess...
>
>
> On Tue, Oct 20, 2015 at 6:24 PM, Reynold Xin <r...@databricks.com> wrote:
>
>> Can you try coarse-grained mode and see if it is the same?
>>
>>
>> On Tue, Oct 20, 2015 at 3:20 PM, Jerry Lam <chiling...@gmail.com> wrote:
>>
>>> Hi Reynold,
>>>
>>> Yes, I'm using 1.5.1. I see them quite often. Sometimes it recovers but
>>> sometimes it does not. For one particular job, it failed all the time with
>>> the acquire-memory issue. I'm using Spark on Mesos with fine-grained mode.
>>> Does it make a difference?
>>>
>>> Best Regards,
>>>
>>> Jerry
>>>
>>> On Tue, Oct 20, 2015 at 5:27 PM, Reynold Xin <r...@databricks.com>
>>> wrote:
>>>
>>>> Jerry - I think that's been fixed in 1.5.1. Do you still see it?
>>>>
>>>> On Tue, Oct 20, 2015 at 2:11 PM, Jerry Lam <chiling...@gmail.com>
>>>> wrote:
>>>>
>>>>> I disabled it because of the "Could not acquire 65536 bytes of
>>>>> memory" error. It causes the job to fail. So for now, I'm not touching it.
>>>>>
>>>>> On Tue, Oct 20, 2015 at 4:48 PM, charmee <charm...@gmail.com> wrote:
>>>>>
>>>>>> We had disabled Tungsten after we found a few performance issues, but
>>>>>> had to enable it again because we found that when we had a large
>>>>>> number of GROUP BY fields, the shuffle keeps failing if Tungsten is
>>>>>> disabled.
>>>>>>
>>>>>> Here is an excerpt from one of our engineers with his analysis.
>>>>>>
>>>>>> With Tungsten

Re: If you use Spark 1.5 and disabled Tungsten mode ...

2015-10-21 Thread Reynold Xin
Is this still Mesos fine-grained mode?


On Wed, Oct 21, 2015 at 1:16 PM, Jerry Lam <chiling...@gmail.com> wrote:

> Hi guys,
>
> There is another memory issue. Not sure if this is related to Tungsten
> this time because I have it disabled (spark.sql.tungsten.enabled=false). It
> happens more often when there are too many tasks running (300). I need to
> limit the number of tasks to avoid this. The executor has 6G. Spark 1.5.1
> is being used.
>
> Best Regards,
>
> Jerry
>
> org.apache.spark.SparkException: Task failed while writing rows.
>   at 
> org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:393)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>   at org.apache.spark.scheduler.Task.run(Task.scala:88)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Unable to acquire 67108864 bytes of memory
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.acquireNewPage(UnsafeExternalSorter.java:351)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.<init>(UnsafeExternalSorter.java:138)
>   at 
> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.create(UnsafeExternalSorter.java:106)
>   at 
> org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:74)
>   at 
> org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:56)
>   at 
> org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:339)
>
>
> On Tue, Oct 20, 2015 at 9:10 PM, Reynold Xin <r...@databricks.com> wrote:
>
>> With Jerry's permission, sending this back to the dev list to close the
>> loop.
>>
>>
>> -- Forwarded message --
>> From: Jerry Lam <chiling...@gmail.com>
>> Date: Tue, Oct 20, 2015 at 3:54 PM
>> Subject: Re: If you use Spark 1.5 and disabled Tungsten mode ...
>> To: Reynold Xin <r...@databricks.com>
>>
>>
>> Yup, coarse-grained mode works just fine. :)
>> The difference is that by default, coarse-grained mode uses 1 core per
>> task. If I constrain 20 cores in total, there can be only 20 tasks running
>> at the same time. However, with fine-grained, I cannot set the total number
>> of cores, and therefore there could be 200+ tasks running at the same time
>> (it is dynamic). So it might be that the calculation of how much memory to
>> acquire fails when the number of cores cannot be known ahead of time,
>> because you cannot assume that X tasks are running in an executor? Just my
>> guess...
>>
>>
>> On Tue, Oct 20, 2015 at 6:24 PM, Reynold Xin <r...@databricks.com> wrote:
>>
>>> Can you try coarse-grained mode and see if it is the same?
>>>
>>>
>>> On Tue, Oct 20, 2015 at 3:20 PM, Jerry Lam <chiling...@gmail.com> wrote:
>>>
>>>> Hi Reynold,
>>>>
>>>> Yes, I'm using 1.5.1. I see them quite often. Sometimes it recovers but
>>>> sometimes it does not. For one particular job, it failed all the time with
>>>> the acquire-memory issue. I'm using Spark on Mesos with fine-grained mode.
>>>> Does it make a difference?
>>>>
>>>> Best Regards,
>>>>
>>>> Jerry
>>>>
>>>> On Tue, Oct 20, 2015 at 5:27 PM, Reynold Xin <r...@databricks.com>
>>>> wrote:
>>>>
>>>>> Jerry - I think that's been fixed in 1.5.1. Do you still see it?
>>>>>
>>>>> On Tue, Oct 20, 2015 at 2:11 PM, Jerry Lam <chiling...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I disabled it because of the "Could not acquire 65536 bytes of
>>>>>> memory" error. It causes the job to fail. So for now, I'm not touching it.
>>>>>>
>>>>>> On Tue, Oct 20, 2015 at 4:48 PM, charmee <charm...@gmail.com> wrote:
>>>>>>

Re: If you use Spark 1.5 and disabled Tungsten mode ...

2015-10-21 Thread Jerry Lam
Yes. The crazy thing about Mesos running in fine-grained mode is that there
is no way (correct me if I'm wrong) to set the number of cores per
executor. If one of my slaves on Mesos has 32 cores, fine-grained mode can
allocate 32 cores on this executor for the job, and if there are 32 tasks
running on this executor at the same time, that is when the acquire-memory
issue appears. Of course, the 32 cores are dynamically allocated, so Mesos
can take them back or hand them out again depending on cluster
utilization.
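
(For completeness, a minimal sketch of the coarse-grained setup that worked,
using the Spark 1.5-era Mesos settings spark.mesos.coarse and
spark.cores.max:)

import org.apache.spark.{SparkConf, SparkContext}

// Sketch: coarse-grained Mesos mode with a hard cap on total cores, so at
// most 20 tasks ever compete for executor memory at the same time.
val conf = new SparkConf()
  .setAppName("bounded-concurrency")
  .set("spark.mesos.coarse", "true") // turn off fine-grained mode
  .set("spark.cores.max", "20")      // cluster-wide cap on task slots
val sc = new SparkContext(conf)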

On Wed, Oct 21, 2015 at 5:13 PM, Reynold Xin <r...@databricks.com> wrote:

> Is this still Mesos fine-grained mode?
>
>
> On Wed, Oct 21, 2015 at 1:16 PM, Jerry Lam <chiling...@gmail.com> wrote:
>
>> Hi guys,
>>
>> There is another memory issue. Not sure if this is related to Tungsten
>> this time because I have it disabled (spark.sql.tungsten.enabled=false). It
>> happens more often when there are too many tasks running (300). I need to
>> limit the number of tasks to avoid this. The executor has 6G. Spark 1.5.1
>> is being used.
>>
>> Best Regards,
>>
>> Jerry
>>
>> org.apache.spark.SparkException: Task failed while writing rows.
>>  at 
>> org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:393)
>>  at 
>> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
>>  at 
>> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
>>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>  at org.apache.spark.scheduler.Task.run(Task.scala:88)
>>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>>  at 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>  at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>  at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.io.IOException: Unable to acquire 67108864 bytes of memory
>>  at 
>> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.acquireNewPage(UnsafeExternalSorter.java:351)
>>  at 
>> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.<init>(UnsafeExternalSorter.java:138)
>>  at 
>> org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.create(UnsafeExternalSorter.java:106)
>>  at 
>> org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:74)
>>  at 
>> org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:56)
>>  at 
>> org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:339)
>>
>>
>> On Tue, Oct 20, 2015 at 9:10 PM, Reynold Xin <r...@databricks.com> wrote:
>>
>>> With Jerry's permission, sending this back to the dev list to close the
>>> loop.
>>>
>>>
>>> -- Forwarded message --
>>> From: Jerry Lam <chiling...@gmail.com>
>>> Date: Tue, Oct 20, 2015 at 3:54 PM
>>> Subject: Re: If you use Spark 1.5 and disabled Tungsten mode ...
>>> To: Reynold Xin <r...@databricks.com>
>>>
>>>
>>> Yup, coarse-grained mode works just fine. :)
>>> The difference is that by default, coarse-grained mode uses 1 core per
>>> task. If I constrain 20 cores in total, there can be only 20 tasks running
>>> at the same time. However, with fine-grained, I cannot set the total number
>>> of cores, and therefore there could be 200+ tasks running at the same time
>>> (it is dynamic). So it might be that the calculation of how much memory to
>>> acquire fails when the number of cores cannot be known ahead of time,
>>> because you cannot assume that X tasks are running in an executor? Just my
>>> guess...
>>>
>>>
>>> On Tue, Oct 20, 2015 at 6:24 PM, Reynold Xin <r...@databricks.com>
>>> wrote:
>>>
>>>> Can you try coarse-grained mode and see if it is the same?
>>>>
>>>>
>>>> On Tue, Oct 20, 2015 at 3:20 PM, Jerry Lam <chiling...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Reynold,
>>>>>
>>>>> Yes, I'm using 1.5.1. I see them quite often. Sometimes it recovers
>>>>> but sometimes it does not. For one particular job, it failed all the time
>>>>>

Re: If you use Spark 1.5 and disabled Tungsten mode ...

2015-10-20 Thread Jerry Lam
I disabled it because of the "Could not acquire 65536 bytes of memory" error.
It causes the job to fail. So for now, I'm not touching it.

On Tue, Oct 20, 2015 at 4:48 PM, charmee  wrote:

> We had disabled Tungsten after we found a few performance issues, but had to
> enable it again because we found that when we had a large number of GROUP BY
> fields, the shuffle keeps failing if Tungsten is disabled.
>
> Here is an excerpt from one of our engineers with his analysis.
>
> With Tungsten Enabled (default in Spark 1.5):
> ~90 files of 0.5G each:
>
> Ingest (after applying broadcast lookups) : 54 min
> Aggregation (~30 fields in GROUP BY and another 40 in aggregation) : 18 min
>
> With Tungsten Disabled:
>
> Ingest : 30 min
> Aggregation : Erroring out
>
> In smaller tests we found that joins are slow with Tungsten enabled. With
> GROUP BY, disabling Tungsten does not work in the first place.
>
> Hope this helps.
>
> -Charmee
>
>
>
>
>


Re: If you use Spark 1.5 and disabled Tungsten mode ...

2015-10-20 Thread Reynold Xin
Jerry - I think that's been fixed in 1.5.1. Do you still see it?

On Tue, Oct 20, 2015 at 2:11 PM, Jerry Lam  wrote:

> I disabled it because of the "Could not acquire 65536 bytes of memory"
> error. It causes the job to fail. So for now, I'm not touching it.
>
> On Tue, Oct 20, 2015 at 4:48 PM, charmee  wrote:
>
>> We had disabled Tungsten after we found a few performance issues, but had to
>> enable it again because we found that when we had a large number of GROUP BY
>> fields, the shuffle keeps failing if Tungsten is disabled.
>>
>> Here is an excerpt from one of our engineers with his analysis.
>>
>> With Tungsten Enabled (default in Spark 1.5):
>> ~90 files of 0.5G each:
>>
>> Ingest (after applying broadcast lookups) : 54 min
>> Aggregation (~30 fields in GROUP BY and another 40 in aggregation) : 18
>> min
>>
>> With Tungsten Disabled:
>>
>> Ingest : 30 min
>> Aggregation : Erroring out
>>
>> In smaller tests we found that joins are slow with Tungsten enabled. With
>> GROUP BY, disabling Tungsten does not work in the first place.
>>
>> Hope this helps.
>>
>> -Charmee
>>
>>
>>
>>
>>
>


Re: If you use Spark 1.5 and disabled Tungsten mode ...

2015-10-20 Thread charmee
We had disabled Tungsten after we found a few performance issues, but had to
enable it again because we found that when we had a large number of GROUP BY
fields, the shuffle keeps failing if Tungsten is disabled.

Here is an excerpt from one of our engineers with his analysis.

With Tungsten Enabled (default in Spark 1.5):
~90 files of 0.5G each:

Ingest (after applying broadcast lookups) : 54 min
Aggregation (~30 fields in GROUP BY and another 40 in aggregation) : 18 min

With Tungsten Disabled:

Ingest : 30 min
Aggregation : Erroring out

In smaller tests we found that joins are slow with Tungsten enabled. With
GROUP BY, disabling Tungsten does not work in the first place.

Hope this helps. 

-Charmee






Re: If you use Spark 1.5 and disabled Tungsten mode ...

2015-10-20 Thread Jerry Lam
Hi Reynold,

Yes, I'm using 1.5.1. I see them quite often. Sometimes it recovers but
sometimes it does not. For one particular job, it failed all the time with
the acquire-memory issue. I'm using Spark on Mesos with fine-grained mode.
Does it make a difference?

Best Regards,

Jerry

On Tue, Oct 20, 2015 at 5:27 PM, Reynold Xin  wrote:

> Jerry - I think that's been fixed in 1.5.1. Do you still see it?
>
> On Tue, Oct 20, 2015 at 2:11 PM, Jerry Lam  wrote:
>
>> I disabled it because of the "Could not acquire 65536 bytes of memory"
>> error. It causes the job to fail. So for now, I'm not touching it.
>>
>> On Tue, Oct 20, 2015 at 4:48 PM, charmee  wrote:
>>
>>> We had disabled Tungsten after we found a few performance issues, but had
>>> to enable it again because we found that when we had a large number of
>>> GROUP BY fields, the shuffle keeps failing if Tungsten is disabled.
>>>
>>> Here is an excerpt from one of our engineers with his analysis.
>>>
>>> With Tungsten Enabled (default in Spark 1.5):
>>> ~90 files of 0.5G each:
>>>
>>> Ingest (after applying broadcast lookups) : 54 min
>>> Aggregation (~30 fields in GROUP BY and another 40 in aggregation) : 18
>>> min
>>>
>>> With Tungsten Disabled:
>>>
>>> Ingest : 30 min
>>> Aggregation : Erroring out
>>>
>>> In smaller tests we found that joins are slow with Tungsten enabled. With
>>> GROUP BY, disabling Tungsten does not work in the first place.
>>>
>>> Hope this helps.
>>>
>>> -Charmee
>>>
>>>
>>>
>>>
>>>
>>
>


Fwd: If you use Spark 1.5 and disabled Tungsten mode ...

2015-10-20 Thread Reynold Xin
With Jerry's permission, sending this back to the dev list to close the
loop.


-- Forwarded message --
From: Jerry Lam <chiling...@gmail.com>
Date: Tue, Oct 20, 2015 at 3:54 PM
Subject: Re: If you use Spark 1.5 and disabled Tungsten mode ...
To: Reynold Xin <r...@databricks.com>


Yup, coarse-grained mode works just fine. :)
The difference is that by default, coarse-grained mode uses 1 core per
task. If I constrain 20 cores in total, there can be only 20 tasks running
at the same time. However, with fine-grained, I cannot set the total number
of cores, and therefore there could be 200+ tasks running at the same time
(it is dynamic). So it might be that the calculation of how much memory to
acquire fails when the number of cores cannot be known ahead of time,
because you cannot assume that X tasks are running in an executor? Just my
guess...


On Tue, Oct 20, 2015 at 6:24 PM, Reynold Xin <r...@databricks.com> wrote:

> Can you try coarse-grained mode and see if it is the same?
>
>
> On Tue, Oct 20, 2015 at 3:20 PM, Jerry Lam <chiling...@gmail.com> wrote:
>
>> Hi Reynold,
>>
>> Yes, I'm using 1.5.1. I see them quite often. Sometimes it recovers but
>> sometimes it does not. For one particular job, it failed all the time with
>> the acquire-memory issue. I'm using Spark on Mesos with fine-grained mode.
>> Does it make a difference?
>>
>> Best Regards,
>>
>> Jerry
>>
>> On Tue, Oct 20, 2015 at 5:27 PM, Reynold Xin <r...@databricks.com> wrote:
>>
>>> Jerry - I think that's been fixed in 1.5.1. Do you still see it?
>>>
>>> On Tue, Oct 20, 2015 at 2:11 PM, Jerry Lam <chiling...@gmail.com> wrote:
>>>
>>>> I disabled it because of the "Could not acquire 65536 bytes of memory"
>>>> error. It causes the job to fail. So for now, I'm not touching it.
>>>>
>>>> On Tue, Oct 20, 2015 at 4:48 PM, charmee <charm...@gmail.com> wrote:
>>>>
>>>>> We had disabled Tungsten after we found a few performance issues, but
>>>>> had to enable it again because we found that when we had a large
>>>>> number of GROUP BY fields, the shuffle keeps failing if Tungsten is
>>>>> disabled.
>>>>>
>>>>> Here is an excerpt from one of our engineers with his analysis.
>>>>>
>>>>> With Tungsten Enabled (default in Spark 1.5):
>>>>> ~90 files of 0.5G each:
>>>>>
>>>>> Ingest (after applying broadcast lookups) : 54 min
>>>>> Aggregation (~30 fields in GROUP BY and another 40 in aggregation) :
>>>>> 18 min
>>>>>
>>>>> With Tungsten Disabled:
>>>>>
>>>>> Ingest : 30 min
>>>>> Aggregation : Erroring out
>>>>>
>>>>> In smaller tests we found that joins are slow with Tungsten enabled.
>>>>> With GROUP BY, disabling Tungsten does not work in the first place.
>>>>>
>>>>> Hope this helps.
>>>>>
>>>>> -Charmee
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>


Re: If you use Spark 1.5 and disabled Tungsten mode ...

2015-10-15 Thread Josh Rosen
To clarify, we're asking about the *spark.sql.tungsten.enabled* flag, which
was introduced in Spark 1.5 and enables Project Tungsten optimizations in
Spark SQL. This option is set to *true* by default in Spark 1.5+ and exists
primarily to allow users to disable the new code paths if they encounter
bugs or performance regressions.

If anyone sets spark.sql.tungsten.enabled=*false* in their SparkConf in
order to *disable* these optimizations, we'd like to hear from you in order
to figure out why you disabled them and to see whether we can make
improvements to allow your workload to run with Tungsten enabled.
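
(For reference, a minimal sketch of how one would flip the flag in 1.5,
either app-wide via SparkConf or per SQLContext:)

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Sketch: disabling the Tungsten code paths in Spark 1.5. The flag defaults
// to true; set it to false only to work around a bug or regression.
val conf = new SparkConf()
  .setAppName("tungsten-off")
  .set("spark.sql.tungsten.enabled", "false")             // app-wide
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
sqlContext.setConf("spark.sql.tungsten.enabled", "false") // or per context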

Thanks,
Josh

On Thu, Oct 15, 2015 at 9:33 AM, mkhaitman  wrote:

> Are you referring to spark.shuffle.manager=tungsten-sort? If so, we saw the
> default value as still being the regular sort, and since it was only
> introduced in 1.5, we were actually waiting a bit to see if anyone
> ENABLED it as opposed to DISABLING it, since it's disabled by default! :)
>
> I recall enabling it during testing within our dev environment, but we
> didn't have a comparable workload and environment to our production cluster,
> so we were going to play it safe and wait until 1.6 in case there were any
> major changes / regressions that weren't seen during 1.5 testing!
>
> Mark.
>
>
>
>
>


Re: If you use Spark 1.5 and disabled Tungsten mode ...

2015-10-15 Thread mkhaitman
My apologies for mixing up what was being referred to in that case! :)

Mark.








Re: If you use Spark 1.5 and disabled Tungsten mode ...

2015-10-15 Thread mkhaitman
Are you referring to spark.shuffle.manager=tungsten-sort? If so, we saw the
default value as still being the regular sort, and since it was only
introduced in 1.5, we were actually waiting a bit to see if anyone
ENABLED it as opposed to DISABLING it, since it's disabled by default! :)
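
(The setting being mixed up here, as a sketch using the Spark 1.5 property
name; this core shuffle option is separate from the SQL-level
spark.sql.tungsten.enabled flag the thread is actually about:)

import org.apache.spark.{SparkConf, SparkContext}

// Sketch: opting in to the tungsten-sort shuffle manager in Spark 1.5,
// where the default shuffle manager was still "sort".
val conf = new SparkConf()
  .setAppName("tungsten-sort-shuffle")
  .set("spark.shuffle.manager", "tungsten-sort")
val sc = new SparkContext(conf)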

I recall enabling it during testing within our dev environment, but we didn't
have a comparable workload and environment to our production cluster, so we
were going to play it safe and wait until 1.6 in case there were any major
changes / regressions that weren't seen during 1.5 testing!

Mark.


